1 Introduction

Next-generation wireless communication systems are required to deliver information in a fast, trustworthy, low-latency, and secure manner. Recent communication networks have adopted the millimeter-wave (mmWave) frequency band to provide innovative features and enable high-data-rate applications, such as high-quality video transmission, vehicle-to-infrastructure communications, intra- and inter-vehicle messaging, device-to-device communications, Internet of Things frameworks, and the tactile Internet [6, 7]. Integrated sensing and communication (ISAC) can be implemented in several forms. For example, dual-function radar-communication (DFRC) systems share the same spectrum and hardware for both wireless communications and radar sensing, thereby achieving high spectral and cost efficiency [8,9,10,11].

Recently, the reconfigurable intelligent surface (RIS) has emerged as a promising means for next-generation wireless communication systems to boost the channel capacity and increase service coverage [12]. An RIS is a metasurface made up of a large number of passive reflecting elements, typically arranged in a rectangular shape, in which each element can be digitally controlled to adjust the amplitude and/or phase of the incident signal, thereby allowing optimized control of the propagation channels [13,14,15]. The capability of the RIS to transform wireless propagation environments, which are traditionally considered unmanageable, into controllable ones opens a new direction that changes the paradigm of wireless communications and sensing with much higher capability and flexibility.

Naturally, RIS is found to be attractive in ISAC applications. For example, RIS can assist the base station (BS) to better communicate and detect communication users (CUs) and targets that are outside of the line-of-sight (LOS) region of the BS [16,17,18,\(L_u\) uncorrelated multipath. Denote the 2-D DOA of the \(l_u\)-th path from the CU and observed at the RIS as \(\{\theta _{l_u}, \phi _{l_u}\}\), \(l_u=1, \cdots , L_u\), where \(\theta _{l_u}\in [-{\pi }/2,{\pi }/2]\) and \(\phi _{l_u}\in [-{\pi }/2,{\pi }/2]\) are respectively the elevation and the azimuth angles. The 2-D steering vector \({{\textbf {a}}}(\theta _{l_u},\phi _{l_u})\) at the RIS is expressed as [36]

$$\begin{aligned} {{\textbf {a}}}(\theta _{l_u},\phi _{l_u}) = {{\textbf {a}}}_z(\theta _{l_u})\otimes {{\textbf {a}}}_x(\phi _{l_u}), \end{aligned}$$
(1)

where

$$\begin{aligned} {{\textbf {a}}}_x(\phi _{l_u})= & {} [1, e^{-j\frac{2\pi }{\lambda }d\sin (\phi _{l_u})},\ldots , e^{-j\frac{2\pi }{\lambda }(N_x-1)d\sin (\phi _{l_u})}]^{\textrm{T}}, \end{aligned}$$
(2)
$$\begin{aligned} {{\textbf {a}}}_z(\theta _{l_u})= & {} [1, e^{-j\frac{2\pi }{\lambda }d\sin (\theta _{l_u})},\ldots , e^{-j\frac{2\pi }{\lambda }(N_z-1)d\sin (\theta _{l_u})}]^{\textrm{T}}. \end{aligned}$$
(3)
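As a minimal sketch of (1)-(3), the steering vectors can be built numerically as follows. The half-wavelength spacing (`d_over_lambda=0.5`), the function names, and the example array sizes are assumptions made for illustration and are not taken from the text.

```python
import numpy as np

def ula_steering(angle, n_elem, d_over_lambda=0.5):
    """Steering vector [1, e^{-j 2 pi (d/lambda) sin(angle)}, ...]^T as in (2)-(3)."""
    k = np.arange(n_elem)
    return np.exp(-1j * 2 * np.pi * d_over_lambda * k * np.sin(angle))

def ris_steering(theta, phi, n_z, n_x, d_over_lambda=0.5):
    """2-D steering vector a(theta, phi) = a_z(theta) kron a_x(phi) as in (1)."""
    return np.kron(ula_steering(theta, n_z, d_over_lambda),
                   ula_steering(phi, n_x, d_over_lambda))

# Hypothetical example: N_z = 4, N_x = 6, so the RIS has N = 24 elements.
a = ris_steering(np.deg2rad(20.0), np.deg2rad(-35.0), n_z=4, n_x=6)
print(a.shape)  # (24,)
```

With this ordering of the Kronecker product, the first \(N_x\) entries of the result equal \({{\textbf {a}}}_x(\phi )\), i.e., the x-axis index varies fastest.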

Since the CU contains a single antenna, the channel of the CU-RIS link is expressed as

$$\begin{aligned} {{\textbf {h}}}_{r} = \sum \limits _{l_u=1}^{L_{u}} \beta _{l_u}{{\textbf {a}}}(\theta _{l_u}, \phi _{l_u}), \end{aligned}$$
(4)

where \(\beta _{l_u}\) is the path gain. The signal vectors received at the RIS elements along the x- and z-axes are respectively given as [35, 36]

$$\begin{aligned} {{\textbf {y}}}_x(t)= & {} \sum \limits _{l_u=1}^{L_u} \beta _{l_u}{{\textbf {a}}}_{x}(\phi _{l_u}) { s}_{u}(t)+ {{\textbf {n}}}_x(t) = {{\textbf {A}}}_x {{\textbf {s}}}(t) + {{\textbf {n}}}_x(t), \end{aligned}$$
(5)
$$\begin{aligned} {{\textbf {y}}}_z(t)= & {} \sum \limits _{l_u=1}^{L_u} \beta _{l_u}{{\textbf {a}}}_{z}(\theta _{l_u}) { s}_{u}(t)+ {{\textbf {n}}}_z(t) = {{\textbf {A}}}_z {{\textbf {s}}}(t) + {{\textbf {n}}}_z(t), \end{aligned}$$
(6)

where \({s}_u(t)\) denotes the signal transmitted by the CU, \({{\textbf {s}}}(t) = [\beta _1, \beta _2, \cdots , \beta _{L_u}]^{\textrm{T}} s_u(t)\), \({{\textbf {A}}}_x= [{{\textbf {a}}}_x(\phi _1), {{\textbf {a}}}_x(\phi _2),\cdots ,{{\textbf {a}}}_x(\phi _{L_u})]\), and \({{\textbf {A}}}_z= [{{\textbf {a}}}_z(\theta _1), {{\textbf {a}}}_z(\theta _2),\cdots ,{{\textbf {a}}}_z(\theta _{L_u})]\). In addition, \({{\textbf {n}}}_x(t) \sim {\mathcal{C}\mathcal{N}}(0, \sigma _{\textrm{n}}^2 {{\textbf {I}}}_{N_x})\) and \({{\textbf {n}}}_z(t) \sim {\mathcal{C}\mathcal{N}}(0, \sigma _{\textrm{n}}^2 {{\textbf {I}}}_{N_z})\) are the additive white Gaussian noise (AWGN) vectors observed at the x- and z-axis direction elements.
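The x-axis model in (5) can be simulated with a few lines of NumPy. This is only a sketch: the array size, angles, path gains, noise level, number of snapshots, and the half-wavelength spacing are all hypothetical values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_x, n_paths, n_snap = 8, 2, 200                # hypothetical sizes
sigma_n = 0.1                                   # hypothetical noise standard deviation
phi = np.deg2rad([-20.0, 35.0])                 # hypothetical azimuth angles
beta = np.array([1.0, 0.6 * np.exp(1j * 0.8)])  # hypothetical path gains

# Manifold matrix A_x (N_x x L_u), assuming d = lambda / 2.
k = np.arange(n_x)[:, None]
A_x = np.exp(-1j * np.pi * k * np.sin(phi)[None, :])

# s(t) = [beta_1, ..., beta_Lu]^T s_u(t), one column per snapshot.
s_u = (rng.standard_normal(n_snap) + 1j * rng.standard_normal(n_snap)) / np.sqrt(2)
S = beta[:, None] * s_u[None, :]

# Circular complex AWGN with variance sigma_n^2 per element.
noise = sigma_n * (rng.standard_normal((n_x, n_snap))
                   + 1j * rng.standard_normal((n_x, n_snap))) / np.sqrt(2)
Y_x = A_x @ S + noise                           # columns are y_x(t)
```

The z-axis observations in (6) follow the same pattern with \({{\textbf {A}}}_z\) built from the elevation angles.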

In the sensing mode, the signals are observed only at the two sparse subarrays, which consist of \({{{\bar{N}}}}_x\) and \({{{\bar{N}}}}_z\) elements, respectively. We define binary mask matrices for the two subarrays, \({{\textbf {U}}}_x \in {\mathbb {C}}^{W_x\times N_x}\) and \({{\textbf {U}}}_z \in {\mathbb {C}}^{W_z\times N_z}\), given as

$$\begin{aligned} { [{{\textbf {U}}}_x]_{g,g} = {\left\{ \begin{array}{ll} 1, &{} gd \in {\mathbb {X}},\\ 0, &{} \text {otherwise}, \end{array}\right. } } \qquad { [{{\textbf {U}}}_z]_{g,g} = {\left\{ \begin{array}{ll} 1, &{} gd \in {\mathbb {Z}},\\ 0, &{} \text {otherwise}. \end{array}\right. } } \end{aligned}$$
(7)

Then, the observed signal vectors corresponding to the two subarrays become

$$\begin{aligned} {\tilde{{{\textbf {y}}}}_x}(t) = {{\textbf {U}}}_x {{\textbf {y}}}_x(t)\in {\mathbb {C}}^{W_x\times 1}, \qquad {\tilde{{{\textbf {y}}}}_z}(t) = {{\textbf {U}}}_z {{\textbf {y}}}_z(t)\in {\mathbb {C}}^{W_z\times 1}. \end{aligned}$$
(8)

Note that only \({{{\bar{N}}}}_x\) and \({{{\bar{N}}}}_z\) elements respectively in \({\tilde{{{\textbf {y}}}}_x}(t)\) and \({\tilde{{{\textbf {y}}}}_z}(t)\) corresponding to the active element positions are nonzero.
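The masking in (7) and (8) can be illustrated with a toy example. The sizes and the active-element positions below are hypothetical, and the snapshot is a deterministic stand-in rather than a simulated signal.

```python
import numpy as np

n_x, w_x = 10, 8                 # hypothetical: N_x element positions, aperture W_x
active = [0, 1, 4, 7]            # hypothetical active-element indices (in units of d)

# Binary mask: [U_x]_{g,g} = 1 at active positions and 0 elsewhere, as in (7).
U_x = np.zeros((w_x, n_x))
U_x[active, active] = 1.0

y_x = np.arange(1, n_x + 1) * (1 + 1j)   # stand-in for one snapshot y_x(t)
y_tilde = U_x @ y_x                      # masked observation, as in (8)
```

Here `y_tilde` has \(W_x = 8\) entries, of which only the four at the active positions are nonzero.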

In Sect. 3, we will use the masked signals observed in the sparse subarrays to interpolate the missing elements within the respective subarray aperture and estimate the signal channels.

2.2.2 CU-BS channel

Similar to the CU-RIS link, we consider that the signal transmitted from the CU arrives at the BS through \(L_d\) multipath components. The time-varying channel between the BS and the CU is formulated as

$$\begin{aligned} {{\textbf {h}}}_{d} = \sum \limits _{l_d=1}^{L_{d}} \gamma _{l_d}{{\textbf {f}}}(\theta _{l_d}), \end{aligned}$$
(9)

where \(\gamma _{l_d}\) is the \(l_d\)-th channel path gain and \({{\textbf {f}}}(\theta _{l_d})\in {\mathbb {C}}^{M \times 1}\) is the array steering vector at the BS for \(l_d=1, \cdots , L_d\). Note that we assume that the M antennas in the BS array are vertically placed so only the elevation components are considered.

2.2.3 BS-RIS channel

The channel between the BS, in which the M antennas form a vertically aligned ULA, and the RIS is assumed to be fixed and thus is known at the BS. This channel between the BS and the RIS can be decomposed into \(L_b \le \textrm{min}(M,N)\) independent paths, given as

$$\begin{aligned} {{\textbf {G}}} = \sum _{l_b=1}^{L_b}\alpha _{l_b} {{\textbf {a}}}(\theta _{l_r},\phi _{l_r}) {{\textbf {f}}}^{\textrm{H}}(\theta _{l_b}) = {{\textbf {A}}}_{l_r}\text {diag}({\varvec{\alpha }}_{{\textrm{BI}}}) {{\textbf {F}}}_{l_r}^{\textrm{H}} \in {{\mathbb {C}}}^{N \times M}, \end{aligned}$$
(10)

where, for the \(l_b\)-th path, \(\alpha _{l_b}\) denotes the gain, \({{\textbf {a}}}(\theta _{l_r},\phi _{l_r}) = {{\textbf {a}}}_z(\theta _{l_r})\otimes {{\textbf {a}}}_x(\phi _{l_r})\in {\mathbb {C}}^{N\times 1}\) is the corresponding steering vector at the RIS, \({{\textbf {f}}}(\theta _{l_b})\in {\mathbb {C}}^{M\times 1}\) is the steering vector at the BS, and \({\varvec{\alpha }}_{{\textrm{BI}}} = [\alpha _1,\alpha _2,\cdots ,\alpha _{L_b}]^{\textrm{T}}\) collects the \(L_b\) path gains.
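The two forms of (10), a sum of rank-one terms and a matrix product, can be compared numerically. The sketch below assumes half-wavelength spacing at both arrays; the angles, gains, and array sizes are hypothetical.

```python
import numpy as np

def ula(angle, n):
    """Half-wavelength ULA steering vector (the spacing is an assumption here)."""
    return np.exp(-1j * np.pi * np.arange(n) * np.sin(angle))

M, n_z, n_x = 4, 3, 5                      # hypothetical sizes, N = n_z * n_x = 15
theta_r = np.deg2rad([10.0, -30.0])        # hypothetical elevation angles at the RIS
phi_r = np.deg2rad([25.0, 50.0])           # hypothetical azimuth angles at the RIS
theta_b = np.deg2rad([-15.0, 40.0])        # hypothetical angles at the BS
alpha = np.array([1.0, 0.4j])              # hypothetical path gains

# Columns of A are RIS steering vectors; columns of F are BS steering vectors.
A = np.stack([np.kron(ula(t, n_z), ula(p, n_x)) for t, p in zip(theta_r, phi_r)], axis=1)
F = np.stack([ula(t, M) for t in theta_b], axis=1)
G = A @ np.diag(alpha) @ F.conj().T        # matrix form, N x M

# Path-by-path sum of rank-one terms.
G_sum = sum(al * np.outer(A[:, l], F[:, l].conj()) for l, al in enumerate(alpha))
```

Both constructions give the same matrix, whose rank equals the number of paths when the steering vectors are linearly independent.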

2.2.4 RIS-target-RIS channel

To illustrate the offerings of RIS in providing localization of targets in the NLOS region of the BS, we consider potential target locations which have LOS to the RIS but not the BS. As such, the target locations are estimated based on signals observed at the RIS.

The RIS reflects the impinging signals from the BS to the targets and receives the signals reflected from the targets. We assume \(L_t\) targets in the scene. The direction of the \(l_t\)-th target observed at the RIS is denoted as \((\theta _{l_t},\phi _{l_t})\), which is shared for both the outgoing RIS-target path and the returning target-RIS path. The round-trip channel of the RIS-target-RIS link is modeled as

$$\begin{aligned} {{\textbf {H}}}_{t} = \sum _{l_t=1}^{L_t}\delta _{l_t} {{\textbf {b}}}(\theta _{l_t},\phi _{l_t}) {{\textbf {b}}}^{\textrm{H}}(\theta _{l_t},\phi _{l_t}) = {{\textbf {B}}}_{t}\text {diag}({\varvec{ \delta }}_{{\textrm{TI}}}) {{\textbf {B}}}_{t}^{\textrm{H}} \in {{\mathbb {C}}}^{N \times N}, \end{aligned}$$
(11)

where \(\delta _{l_t}\) combines the round-trip path gain and the radar cross section (RCS) of the \(l_t\)-th target for \(l_t = 1, \cdots , L_t\), \({{\textbf {b}}}(\theta _{l_t},\phi _{l_t})= {{\textbf {b}}}_z(\theta _{l_t})\otimes {{\textbf {b}}}_x(\phi _{l_t}) \in {\mathbb {C}}^{N\times 1}\) denotes the array steering vector of the RIS towards the target, \({{\textbf {B}}}_{t}\) contains these \(L_t\) steering vectors as its columns, and \({\varvec{\delta }}_{{\textrm{TI}}} = [\delta _1,\delta _2,\cdots ,\delta _{L_t}]^{\textrm{T}}\).

Similar to the CU-RIS channel considered in Sect. 2.2.1, the RIS only observes masked signals at the sparsely placed active element positions. The observed channel matrix corresponding to the signal received on the z-axis subarray, denoted as the RIS-targets-RIS(z) link, is described as

$$\begin{aligned} {{\textbf {H}}}_{tz} = \sum _{l_t=1}^{L_t}\delta _{l_t}{{\textbf {U}}}_z {{\textbf {b}}}_z(\theta _{l_t}) {{\textbf {b}}}^{\textrm{H}}(\theta _{l_t},\phi _{l_t}) = {\tilde{{{\textbf {B}}}}}_{tz}\text {diag}({\varvec{\delta }}_{{\textrm{TI}}}) {{\textbf {B}}}_{t}^{\textrm{H}} \in {{\mathbb {C}}}^{W_z \times N}, \end{aligned}$$
(12)

where only \({{{\bar{N}}}}_z\) rows are nonzero. Similarly, the RIS-targets-RIS(x) link is expressed as

$$\begin{aligned} {{\textbf {H}}}_{tx} = \sum _{l_t=1}^{L_t}\delta _{l_t}{{\textbf {U}}}_x {{\textbf {b}}}_x(\phi _{l_t}) {{\textbf {b}}}^{\textrm{H}}(\theta _{l_t},\phi _{l_t}) = {\tilde{{{\textbf {B}}}}}_{tx}\text {diag}({\varvec{\delta }}_{{\textrm{TI}}}) {{\textbf {B}}}_{t}^{\textrm{H}} \in {{\mathbb {C}}}^{W_x \times N}, \end{aligned}$$
(13)

where only \({{{\bar{N}}}}_x\) rows are nonzero.

3 Channel estimation, joint beamforming, and target localization

In this section, we consider the channel estimation, joint beamforming at the BS and the RIS, and the target localization in terms of 2-D DOA estimation. These objectives are addressed in the following three phases. The first phase estimates the time-varying channel between the CU and the RIS using the L-shaped sparse subarrays at the RIS. In the second phase, we maximize the minimum beampattern gain of the RIS towards the desired sensing directions while ensuring the minimum SNR requirement at the CU under the maximum transmit power constraint at the BS. The objective of the third phase is to determine the target 2-D DOAs by the sparse active elements at the RIS.

3.1 Phase I: CU-RIS channel estimation

In the first phase, the signal vectors \({\tilde{{{\textbf {y}}}}}_x(t)\) and \({\tilde{{{\textbf {y}}}}}_z(t)\) observed at the two active subarrays of the RIS are used to estimate the uplink multipath channels between the CU and the RIS. The interpolation technique is applied to obtain the full covariance matrices of vectors \({{{\textbf {y}}}}_x\) and \({{{\textbf {y}}}}_z\) corresponding to all elements spanned by the active subarray aperture, namely, for elements located at positions \(p_0, p_0+1, \cdots , W_x -1\) and \(q_0, q_0+1, \cdots , W_z -1\).

3.1.1 Covariance matrix interpolation

Assuming that the noise is uncorrelated with the signals, the covariance matrix of \({\tilde{{{\textbf {y}}}}}_x(t)\) is expressed as

$$\begin{aligned} {\tilde{{{\textbf {R}}}}}_{x} = {{\mathbb {E}}}[{\tilde{{{\textbf {y}}}}}_x(t){\tilde{{{\textbf {y}}}}}_x^{\textrm{H}}(t)]= {{\textbf {U}}}_x {{\textbf {A}}}_x{{\textbf {R}}}_s{{\textbf {A}}}_x^{\textrm{H}} {{\textbf {U}}}_x^{\textrm{T}}+ \sigma _{\textrm{n}}^2 {{\textbf {U}}}_x {{\textbf {U}}}_x^{\textrm{H}}, \end{aligned}$$
(14)

where \({{\textbf {R}}}_s= \textrm{diag}(\sigma _1^2, \sigma _2^2, \cdots , \sigma _{L_u}^2)\) is the source covariance matrix, \(\sigma _{l_u}^2 = \beta _{l_u}^2 \sigma _u^2\) represents the power of the \(l_u\)-th path signal, and \(\sigma _u^2 = {{\mathbb {E}}}({|s_u(t)|^2})\) is the source signal power. Because of the sparse placement of the active RIS elements, the covariance matrix \({\tilde{{{\textbf {R}}}}}_{x}\) contains holes, i.e., missing entries. We exploit matrix interpolation of \({\tilde{{{\textbf {R}}}}}_{x}\) to obtain an estimate of the interpolated covariance matrix \({{{\textbf {R}}}}_{x} \in {{\mathbb {C}}}^{W_x \times W_x}\).

The matrix interpolation for the x-axis subarray is formulated as the following nuclear norm minimization problem [37]:

$$\begin{aligned} \underset{{\varvec{w}}}{\text {min}} \quad&\Vert {{\mathcal {T}}}({\varvec{w}}){{\textbf {Q}}}_x - {\tilde{{{\textbf {R}}}}}_{x} \Vert _{\textrm{F}}^2 + \zeta \Vert {{\mathcal {T}}}({\varvec{w}}) \Vert _* \\ \text {s.t.} \quad&{{\mathcal {T}}}({\varvec{w}}) \succeq 0, \end{aligned}$$
(15)

where \({{\mathcal {T}}}({\varvec{w}}) \in {{\mathbb {C}}}^{W_x \times W_x}\) denotes the Hermitian and Toeplitz matrix with \({\varvec{w}} \in {{\mathbb {C}}}^{W_x \times 1}\) as its first column, \(\Vert {{\mathcal {T}}}({\varvec{w}}) \Vert _* = \text {tr}(\sqrt{{{\mathcal {T}}}^{\textrm{H}}({\varvec{w}}){{\mathcal {T}}}({\varvec{w}})})\) is the nuclear norm of \({{\mathcal {T}}}({\varvec{w}})\), and \(\zeta\) is a tunable regularization parameter. In addition, \({{\textbf {Q}}}_x = {{\textbf {U}}}_x{{\textbf {U}}}_x^{\textrm{T}}\) is the binary mask of the sparse covariance matrix. The obtained \({{\mathcal {T}}}({\varvec{w}})\) becomes the estimate of \({{{\textbf {R}}}}_{x}\), denoted as \({\hat{{{\textbf {R}}}}}_{x}\). We can similarly perform matrix interpolation for the z-axis subarray and obtain the estimate of the interpolated covariance matrix \({{\textbf {R}}}_{z}\), denoted as \({\hat{{{\textbf {R}}}}}_{z}\).
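Two ingredients of the interpolation problem can be checked numerically: the Hermitian Toeplitz parameterization and the fact that, for a positive semidefinite matrix, the nuclear norm equals the trace. The sketch below does not solve the optimization problem itself; the helper name `herm_toeplitz` and the single-source test covariance are assumptions for illustration.

```python
import numpy as np

def herm_toeplitz(w):
    """Hermitian Toeplitz matrix T(w) whose first column is w (w[0] must be real)."""
    w = np.asarray(w, dtype=complex)
    n = w.size
    idx = np.arange(n)[:, None] - np.arange(n)[None, :]   # row index minus column index
    return np.where(idx >= 0, w[np.abs(idx)], np.conj(w[np.abs(idx)]))

# First column of the covariance of one unit-power source on a
# half-wavelength ULA, plus white noise of power sigma2 on the diagonal.
phi, n, sigma2 = np.deg2rad(25.0), 6, 0.1
w = np.exp(-1j * np.pi * np.arange(n) * np.sin(phi))
w[0] += sigma2
T = herm_toeplitz(w)

nuclear = np.linalg.norm(T, "nuc")   # sum of singular values
trace = np.trace(T).real             # equals n * (1 + sigma2) here
```

Because the constraint forces \({{\mathcal {T}}}({\varvec{w}})\) to be positive semidefinite, the nuclear-norm penalty reduces to a trace penalty, which is linear in \({\varvec{w}}\).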

As the interpolated covariance matrices are full rank, subspace-based methods, such as multiple signal classification (MUSIC), can be applied to \({\hat{{{\textbf {R}}}}}_{x}\) to obtain the azimuth DOAs of the multipath signals. The elevation angles must be paired with the estimated azimuth angles and their estimation is discussed below.
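A compact MUSIC search on a ULA covariance matrix might look as follows. This is a sketch under stated assumptions: half-wavelength spacing, a known number of paths, a uniform angle grid, and an exact (not interpolated) covariance matrix as the test input. The helper names are illustrative.

```python
import numpy as np

def music_spectrum(R, n_src, grid, d_over_lambda=0.5):
    """MUSIC pseudo-spectrum of a ULA covariance matrix R over the angles in grid."""
    n = R.shape[0]
    _, vecs = np.linalg.eigh(R)                 # eigenvalues in ascending order
    En = vecs[:, : n - n_src]                   # noise subspace
    k = np.arange(n)[:, None]
    A = np.exp(-1j * 2 * np.pi * d_over_lambda * k * np.sin(grid)[None, :])
    return 1.0 / np.sum(np.abs(En.conj().T @ A) ** 2, axis=0)

def pick_peaks(spec, grid, n_src):
    """Return the angles of the n_src largest local maxima, sorted by angle."""
    inner = (spec[1:-1] > spec[:-2]) & (spec[1:-1] > spec[2:])
    idx = np.flatnonzero(inner) + 1
    best = idx[np.argsort(spec[idx])[-n_src:]]
    return np.sort(grid[best])

# Hypothetical test case: two uncorrelated paths and an exact covariance matrix.
true_phi = np.deg2rad([-20.0, 35.0])
n = 8
A = np.exp(-1j * np.pi * np.arange(n)[:, None] * np.sin(true_phi)[None, :])
R = A @ np.diag([1.0, 0.5]) @ A.conj().T + 0.01 * np.eye(n)
grid = np.deg2rad(np.arange(-90.0, 90.01, 0.1))
phi_hat = pick_peaks(music_spectrum(R, 2, grid), grid, 2)
```

In the proposed scheme, the input would be the interpolated matrix \({\hat{{{\textbf {R}}}}}_{x}\) rather than the exact covariance used in this test case.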

3.1.2 Pair-matched 2-D DOA estimation

When there are multiple paths between the CU and the RIS, it is important to determine the correct pairing between the estimated azimuth and elevation angles. The array manifold matrix corresponding to the estimated azimuth angles \({{\hat{\phi }}}_1, {{\hat{\phi }}}_2, \cdots , {{\hat{\phi }}}_{L_u}\) can be constructed as \({\hat{{{\textbf {A}}}}}_x= [{{\textbf {a}}}_x({\hat{\phi }}_1), {{\textbf {a}}}_x({\hat{\phi }}_2),\cdots ,{{\textbf {a}}}_x({\hat{\phi }}_{L_u})]\). We will estimate the manifold matrix \({{\textbf {A}}}_z\) of the z-axis subarray from the following cross-covariance matrix between \({\tilde{{{\textbf {y}}}}}_x(t)\) and \({\tilde{{{\textbf {y}}}}}_z(t)\):

$$\begin{aligned} {\tilde{{{\textbf {R}}}}}_{xz} = {{{\mathbb {E}}}[{\tilde{{{\textbf {y}}}}}_x(t){\tilde{{{\textbf {y}}}}}_z^{\textrm{H}}(t)]}. \end{aligned}$$
(16)

Performing the eigendecomposition of the covariance matrices \({\hat{{{\textbf {R}}}}}_{x}\) and \({\hat{{{\textbf {R}}}}}_{z}\) yields

$$\begin{aligned} {\hat{{{\textbf {R}}}}}_{x} = {\hat{{{\textbf {A}}}}}_{x}{\hat{{{\textbf {R}}}}}_s {\hat{{{\textbf {A}}}}}_{x}^{\textrm{H}} + \sigma _{\textrm{n}}^2 {{\textbf {I}}}_{W_x} = {\hat{{{\textbf {V}}}}}_{xs}({\hat{\varvec{\Lambda }}}_{xs} - \sigma _{\textrm{n}}^2 {{\textbf {I}}} ) {\hat{{{\textbf {V}}}}}_{xs}^{\textrm{H}} + \sigma _{\textrm{n}}^2 {{\textbf {I}}}_{W_x}, \end{aligned}$$
(17)

and

$$\begin{aligned} {\hat{{{\textbf {R}}}}}_{z} = {\hat{{{\textbf {A}}}}}_{z}{\hat{{{\textbf {R}}}}}_s{\hat{{{\textbf {A}}}}}_{z}^{\textrm{H}} + \sigma _{\textrm{n}}^2 {{\textbf {I}}}_{W_z} = {\hat{{{\textbf {V}}}}}_{zs}({\hat{{\varvec{\Lambda }}}}_{zs} - \sigma _{\textrm{n}}^2 {{\textbf {I}}} ) {\hat{{{\textbf {V}}}}}_{zs}^{\textrm{H}} + \sigma _{\textrm{n}}^2 {{\textbf {I}}}_{W_z}, \end{aligned}$$
(18)

where \({\hat{{{\textbf {V}}}}}_{xs}\) and \({\hat{{{\textbf {V}}}}}_{zs}\) denote the estimated signal subspaces for the two linear arrays, whereas \({\hat{{\varvec{\Lambda }}}}_{xs}\) and \({\hat{{\varvec{\Lambda }}}}_{zs}\) are the diagonal matrices containing the eigenvalues corresponding to the signal subspaces. As the positions of the active elements on the x- and z-axes are identical, we can write \(W_x = W_z\) and \({{\textbf {I}}}\) \(= {{\textbf {I}}}_{W_x} = {{\textbf {I}}}_{W_z} \in {\mathbb {C}}^{W_x \times W_x}\). Exploiting the relationship between the components spanning the signal subspace in the above formulations, \({{\textbf {R}}}_{{{\textbf {s}}}}\) can be estimated as [38]

$$\begin{aligned} {\hat{{{\textbf {R}}}}}_s = {\hat{{{\textbf {A}}}}}_{x}^{\dagger } {\hat{{{\textbf {V}}}}}_{xs} ({\hat{{\varvec{\Lambda }}}}_{xs} -\sigma _{\textrm{n}}^2 {{\textbf {I}}} ) {\hat{{{\textbf {V}}}}}_{xs}^{\textrm{H}} ({\hat{{{\textbf {A}}}}}_{x}^{\dagger })^{\textrm{H}}. \end{aligned}$$
(19)

Thus, an estimate of the array manifold matrix, \({\hat{{{\textbf {A}}}}}_{z}= [{{\textbf {a}}}_{z}{({\hat{\theta }}_1)}, {{\textbf {a}}}_{z}{({\hat{\theta }}_2)}, \cdots , {{\textbf {a}}}_{z}{({{\hat{\theta }}}_{L_u})}] \in {\mathbb {C}}^{W_z \times L_u}\), can be obtained from (16) as

$$\begin{aligned} \begin{aligned} {\hat{{{\textbf {A}}}}}_z =&({\hat{{{\textbf {R}}}}}_{s}^{-1} {\hat{{{\textbf {A}}}}}_{x}^{\dagger } {\tilde{{{\textbf {R}}}}}_{xz})^{\textrm{H}}. \end{aligned} \end{aligned}$$
(20)

Note that matrix \({\tilde{{{\textbf {R}}}}}_{xz}\) is not a Hermitian and Toeplitz matrix and thus cannot be directly interpolated using interpolation techniques utilizing such properties. Instead, we indirectly obtain the interpolated \({{\textbf {R}}}_{xz}\) from the following operations:

$$\begin{aligned} {\hat{{{\textbf {R}}}}}_{xz} = {\hat{{{\textbf {V}}}}}_{xs}({\hat{{\varvec{\Lambda }}}}_{xs} -\sigma _{\textrm{n}}^2 {{\textbf {I}}} )^{\frac{1}{2}} ({\hat{{\varvec{\Lambda }}}}_{zs} -\sigma _{\textrm{n}}^2 {{\textbf {I}}} )^{\frac{1}{2}} {\hat{{{\textbf {V}}}}}_{zs}^{\textrm{H}} . \end{aligned}$$
(21)

As a result, we can rewrite (20) as

$$\begin{aligned} \begin{aligned} {\hat{{{\textbf {A}}}}}_z =&({\hat{{{\textbf {R}}}}}_{s}^{-1} {\hat{{{\textbf {A}}}}}_{x}^{\dagger } {\hat{{{\textbf {R}}}}}_{xz})^{\textrm{H}}. \end{aligned} \end{aligned}$$
(22)

For the \(l_u\)-th path, \(l_u = 1,2, \cdots , L_u\), the elevation angle can be estimated as

$$\begin{aligned} {\hat{\theta }}_{l_u} = \text {arg}\, \underset{\theta }{\text {max}} \ \left| { {{\textbf {a}}}_{z}^{\textrm{H}}(\theta )}[{\hat{{\textbf {A}}}}_z]_{:,l_u}\right| . \end{aligned}$$
(23)

From the paired azimuth and elevation angle estimation, we can generate the array manifold matrix for the CU-RIS channel as \({\hat{{{\textbf {A}}}}} = \left[ {{\textbf {a}}}( {{\hat{\theta }}}_{1},{{\hat{\phi }}}_{1}), \cdots , {{\textbf {a}}}({{\hat{\theta }}}_{L_u}, {{\hat{\phi }}}_{L_u}) \right] \in {\mathbb {C}}^{N \times L_u}\).
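The pairing step can be illustrated in an idealized, noise-free setting. The sketch assumes that the azimuth angles and the path powers are already known exactly, that the spacing is half a wavelength, and that the score in the elevation search is the magnitude of the correlation; all numerical values are hypothetical.

```python
import numpy as np

def manifold(angles, n):
    """ULA manifold matrix with half-wavelength spacing (an assumption here)."""
    return np.exp(-1j * np.pi * np.arange(n)[:, None] * np.sin(angles)[None, :])

w = 8                                   # hypothetical aperture, W_x = W_z
phi = np.deg2rad([-20.0, 35.0])         # azimuths, assumed already estimated
theta = np.deg2rad([40.0, 10.0])        # true elevations, paired with phi
R_s = np.diag([1.0, 0.5])               # path powers

A_x, A_z = manifold(phi, w), manifold(theta, w)
R_xz = A_x @ R_s @ A_z.conj().T         # noise-free cross-covariance

# Recover the z-axis manifold: A_z_hat = (R_s^{-1} A_x^+ R_xz)^H.
A_z_hat = (np.linalg.inv(R_s) @ np.linalg.pinv(A_x) @ R_xz).conj().T

# Grid search over theta for each column; column l_u belongs to azimuth phi[l_u].
grid = np.deg2rad(np.arange(-90.0, 90.01, 0.1))
scores = np.abs(manifold(grid, w).conj().T @ A_z_hat)   # (grid points) x L_u
theta_hat = grid[np.argmax(scores, axis=0)]
```

Because the columns of the recovered manifold inherit the column order of \({\hat{{{\textbf {A}}}}}_x\), the elevation estimates come out already paired with the azimuth estimates, here 40 and 10 degrees for the paths at -20 and 35 degrees.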

3.1.3 Path gain estimation

Because the path gains are identical for the x- and z-axis subarrays, computation in one of these two subarrays will suffice. To estimate the path gains of the CU-RIS channel, the CU transmits a pilot signal \(s_u(t)\) to the RIS. At the RIS, the received signal at the x-axis elements is given as

$$\begin{aligned} {{\textbf {y}}}_{x}(t) = {{\textbf {A}}}_{x}{{\textbf {g}}} { {s_u(t)}}+ {{\textbf {n}}}_x(t), \end{aligned}$$
(24)

where \({{\textbf {g}}} = [\beta _{1}, \beta _{2}, \cdots , \beta _{L_u}]^{\textrm{T}}\) represents the path gains and can be estimated from

$$\begin{aligned} {\hat{{{\textbf {g}}}}} = \frac{1}{\sigma _u^2}({{\textbf {A}}}_x^{\textrm{H}}{{\textbf {A}}}_x)^{-1}{{\textbf {A}}}_x^{\textrm{H}} \bar{{{\textbf {y}}}}_{x}, \end{aligned}$$
(25)

where \(\bar{{{\textbf {y}}}}_{x} ={\mathbb {E}}\{{{{\textbf {y}}}}_{x}(t) s_u^*(t)\}\). From the above DOAs and path gain estimation, we can reconstruct the CU-RIS multipath channel \({{\textbf {h}}}_r\).
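A least-squares sketch of the path gain estimate is given below. The expectation is replaced by a sample average over pilot snapshots, the pilot is assumed to have constant modulus with power \({{\mathbb {E}}}(|s_u(t)|^2)\) stored in `sigma_u2`, and the manifold matrix is assumed known exactly; the sizes and values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_x, n_snap = 8, 5000                        # hypothetical sizes
phi = np.deg2rad([-20.0, 35.0])              # hypothetical DOAs
g = np.array([1.0, 0.6 * np.exp(1j * 0.8)])  # true path gains (hypothetical)
sigma_u2, sigma_n = 2.0, 0.1                 # pilot power and noise standard deviation

A_x = np.exp(-1j * np.pi * np.arange(n_x)[:, None] * np.sin(phi)[None, :])
# Constant-modulus pilot with random phase, so |s_u(t)|^2 = sigma_u2 exactly.
s_u = np.sqrt(sigma_u2) * np.exp(1j * 2 * np.pi * rng.random(n_snap))
noise = sigma_n * (rng.standard_normal((n_x, n_snap))
                   + 1j * rng.standard_normal((n_x, n_snap))) / np.sqrt(2)
Y = (A_x @ g)[:, None] * s_u[None, :] + noise

# Sample estimate of E{y_x(t) s_u^*(t)}, followed by the least-squares solve.
y_bar = (Y * s_u.conj()[None, :]).mean(axis=1)
g_hat = np.linalg.solve(A_x.conj().T @ A_x, A_x.conj().T @ y_bar) / sigma_u2
```

Solving the normal equations with `np.linalg.solve` avoids forming the explicit inverse of \({{\textbf {A}}}_x^{\textrm{H}}{{\textbf {A}}}_x\).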

Similarly, we can estimate the CU-BS channel at the BS, which, however, does not require interpolation because all BS antennas are active. The normalized root-mean-square error (RMSE) of the CU-RIS channel estimate is computed as

$$\begin{aligned} \text {Normalized RMSE} \triangleq {\sqrt{{\frac{1}{K_\mathrm{{cu}}}\sum _{k=1}^{K_{\textrm{cu}}}\frac{\Vert {\hat{{{\textbf {h}}}}}_r-{{\textbf {h}}}_r\Vert _{\textrm{F}}^2}{\Vert {{\textbf {h}}}_r\Vert _{\textrm{F}}^2}}}}, \end{aligned}$$
(26)

where \(K_\mathrm{{cu}}\) is the number of independent trials.
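The metric in (26) can be computed as in the following sketch, which assumes that the estimated and true channel vectors of the individual trials are stacked as rows. The function name is illustrative.

```python
import numpy as np

def normalized_rmse(h_hat_trials, h_true_trials):
    """Normalized RMSE of (26); each row holds the channel vector of one trial."""
    h_hat = np.asarray(h_hat_trials)
    h_true = np.asarray(h_true_trials)
    num = np.sum(np.abs(h_hat - h_true) ** 2, axis=1)   # squared error per trial
    den = np.sum(np.abs(h_true) ** 2, axis=1)           # channel energy per trial
    return np.sqrt(np.mean(num / den))

# Two toy trials: relative squared errors of 0.5 and 0, so the result is sqrt(0.25).
h_true = np.array([[1.0 + 0j, 1j], [2.0, 0.0]])
h_hat = np.array([[1.0 + 0j, 0.0], [2.0, 0.0]])
print(normalized_rmse(h_hat, h_true))  # 0.5
```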

3.2 Phase II: joint BS and RIS beamforming optimization

In phase II, our objective is to maximize the minimum beampattern gain of the RIS towards the desired sensing angles, while ensuring the minimum SNR requirement at the CU under the maximum transmit power constraint at the BS. Let \({{\textbf {v}}} = [e^{j\psi _1},\cdots , e^{j\psi _N}]^{\textrm{T}} \in {\mathbb {C}}^{N\times 1}\) denote the reflective phase shift vector at the RIS where \(\varvec{\Phi }= \text {diag}({{\textbf {v}}})\). By combining the signals transmitted through the direct BS-CU link and the reflected BS-RIS-CU link, the received signal at the CU is expressed as

$$\begin{aligned} y_{CU}(t) = ({{\textbf {h}}}_{ r}^{\textrm{H}}{\varvec{\Phi }} {{\textbf {G}}}+ {{\textbf {h}}}_d^{\textrm{H}}){{\textbf {w}}}{ s}(t) + { n}_c(t), \end{aligned}$$
(27)

where \({{\textbf {w}}}\in {\mathbb {C}}^{M\times 1}\) denotes the transmit beamforming vector at the BS, \(\textrm{s}(t) \sim \mathcal{C}\mathcal{N}(0,\sigma _{{\textrm{s,BS}}}^2)\) is the transmitted random symbol, and \(n_c(t) \sim \mathcal{C}\mathcal{N}(0,\sigma _{{\textrm{n,CU}}}^2)\) represents the AWGN at the CU. The received SNR at the CU is computed as

$$\begin{aligned} \text {SNR}_{{\textrm{CU}}} = \frac{\sigma _{{\textrm{s,BS}}}^2}{\sigma _{{\textrm{n,CU}}}^2}|({{\textbf {h}}}_{ r}^{\textrm{H}}{\varvec{\Phi }} {{\textbf {G}}}+ {{\textbf {h}}}_d^{\textrm{H}}){{\textbf {w}}}|^2. \end{aligned}$$
(28)

Next, we consider radar sensing towards the potential target locations, which are assumed to be in the NLOS areas of the BS. In this case, we use the virtual LOS links created by the RIS reflection to sense the targets. The beampattern gain of the RIS towards the desired sensing angles is used as the sensing performance metric. The beampattern gain from the RIS towards the target angle \(({\theta _{l_t}, \phi _{l_t}})\) is given as

$$\begin{aligned} \begin{aligned} \mathrm{\rho }(\theta _{l_t},\phi _{l_t})&= {\mathbb {E}}(|{{\textbf {a}}}^{\textrm{H}}(\theta _{l_t}, \phi _{l_t}) {\varvec{\Phi }} {{\textbf {G}}}{{\textbf {w}}}{ s(t)}|^2)\\&= {{\textbf {a}}}^{\textrm{H}}(\theta _{l_t}, \phi _{l_t}) {\varvec{\Phi }} {{\textbf {G}}}{{\textbf {w w}}}^{\textrm{H}}{{\textbf {G}}}^{\textrm{H}} {\varvec{\Phi }}^{\textrm{H}} {{\textbf {a}}}(\theta _{l_t},\phi _{l_t}). \end{aligned} \end{aligned}$$
(29)
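For given beamformers, the communication metric in (28) and the sensing metric in (29) are direct to evaluate. The sketch below uses random stand-in channels and a stand-in steering vector rather than the geometric models of Sect. 2, with unit symbol power; all sizes and values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

def crandn(*shape):
    """Circular complex Gaussian samples with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

M, N = 4, 16                                       # hypothetical BS antennas, RIS elements
G, h_r, h_d = crandn(N, M), crandn(N), crandn(M)   # random stand-in channels
a_t = np.exp(-1j * np.pi * np.arange(N) * 0.3)     # stand-in RIS steering vector
v = np.exp(1j * 2 * np.pi * rng.random(N))         # unit-modulus RIS weights
Phi = np.diag(v)
w = crandn(M)
w *= np.sqrt(1.0 / np.vdot(w, w).real)             # scale to transmit power ||w||^2 = 1
sigma_s2, sigma_n2 = 1.0, 0.01

# Received SNR at the CU, as in (28).
snr_cu = sigma_s2 / sigma_n2 * np.abs((h_r.conj() @ Phi @ G + h_d.conj()) @ w) ** 2

# Beampattern gain towards the target direction, as in the second line of (29).
W = np.outer(w, w.conj())
quad = a_t.conj() @ Phi @ G @ W @ G.conj().T @ Phi.conj().T @ a_t
gain = quad.real
```

The quadratic form is real up to rounding, and it coincides with the squared magnitude of the reflected signal in the target direction.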

We are interested in sensing the prospective targets at \(L_t\) directions observed at the RIS. To achieve the aforementioned objective, i.e., maximizing the minimum beampattern gain at these \(L_t\) angles while ensuring the minimum SNR requirement at the CU under the maximum transmit power constraint at the BS, we formulate the following SNR-constrained minimum beampattern gain maximization problem,

$$\begin{aligned} \underset{{{\textbf {w}}},{\varvec{\Phi }}}{\text {max}} \ \ \underset{l_t\in {\mathcal {L}}}{\text {min}} \quad&{{\textbf {a}}}^{\textrm{H}}(\theta _{l_t},\phi _{l_t}) {\varvec{\Phi }} {{\textbf {G}}}{{\textbf {w w}}}^{\textrm{H}}{{\textbf {G}}}^{\textrm{H}} {\varvec{\Phi }}^{\textrm{H}} {{\textbf {a}}}(\theta _{l_t},\phi _{l_t}) \end{aligned}$$
(30a)
$$\begin{aligned} \text {s.t.} \qquad&{|({{\textbf {h}}}_r^{\textrm{H}} {\varvec{\Phi }} {{\textbf {G}}}+ {{\textbf {h}}}_d^{\textrm{H}}){{\textbf {w}}}|^2}\ge \Gamma {{\sigma _{{\textrm{n,CU}}}^2}}, \end{aligned}$$
(30b)
$$\begin{aligned}&\Vert {{\textbf {w}}} \Vert _2^2 \le {P}_{0}, \end{aligned}$$
(30c)
$${\varvec{\Phi }}= \rm{diag} (e^{j\psi _1},\cdots , e^{j\psi _N}),$$
(30d)
$$\left| {v_{n} } \right| = 1,\;\;\forall n = 1,2, \cdots ,N,$$
(30e)

where \({\mathcal {L}} =\{1, 2, \cdots , L_t\}\), \(P_0\) is the maximum power allowed at the BS, and \(\Gamma\) is the SNR required by the CU. \(v_n\) is the n-th element of \({{\textbf {v}}}\), and the constraint \(|v_n| =1\) ensures that the RIS weights are unit modulus, thereby achieving phase-only beamforming, which is desirable for convenient RIS implementation [39]. This problem is highly nonconvex and thus is difficult to solve directly. It can be solved, however, by using the techniques of alternating optimization and semidefinite relaxation (SDR) [16]. The BS beamforming weight vector \({{\textbf {w}}}\) is first optimized with an initial value of \(\varvec{\Phi }\), and the RIS reflective beamforming matrix \(\varvec{\Phi }\) is then optimized with the optimized \({{\textbf {w}}}\). These procedures are described in the following two subsections.

3.2.1 Transmit beamforming optimization at BS

First, we optimize the transmit beamformer \({{\textbf {w}}}\) in the above problem under a presumed reflective beamformer \({\varvec{\Phi }}\). This problem is formulated as

$$\begin{aligned} \underset{{\textbf {w}}}{\text {max}} \ \ \underset{{l_t}\in {\mathcal {L}}}{\text {min}} \quad&{{\textbf {a}}}^{\textrm{H}}(\theta _{l_t},\phi _{l_t}) {\varvec{\Phi }} {{\textbf {G}}}{{\textbf {w w}}}^{\textrm{H}}{{\textbf {G}}}^{\textrm{H}} {\varvec{\Phi }}^{\textrm{H}} {{\textbf {a}}}(\theta _{l_t},\phi _{l_t}) \end{aligned}$$
(31a)
$$\begin{aligned} \text {s.t.} \qquad&(30b), (30c). \end{aligned}$$
(31b)

To perform SDR, we introduce \({{\textbf {W}}} = {{\textbf {w}}}{{\textbf {w}}}^{\textrm{H}}\) with \({{\textbf {W}}}\succeq 0\) and \({\text {rank}{({{\textbf {W}}})} = 1}\). Let \({{\textbf {h}}} = {{\textbf {G}}}^{\textrm{H}}{\varvec{\Phi }}^{\textrm{H}} {{\textbf {h}}}_{r} + {{\textbf {h}}}_d\) denote the combined channel vector from the BS to the CU accounting for both the BS-CU and BS-RIS-CU channels. Then, the transmit beamforming optimization in problem (31) is reformulated as

$$\begin{aligned} \underset{{\textbf {W}}}{\text {max}} \ \ \underset{{l_t}\in {\mathcal {L}}}{\text {min}} \quad&{{\textbf {a}}}^{\textrm{H}}(\theta _{l_t},\phi _{l_t}) {\varvec{\Phi }} {{\textbf {G}}}{{\textbf {W}}}{{\textbf {G}}}^{\textrm{H}} {\varvec{\Phi }}^{\textrm{H}} {{\textbf {a}}}(\theta _{l_t},\phi _{l_t}) \end{aligned}$$
(32a)
$$\begin{aligned} \text {s.t.}\qquad&\textrm{tr} ({{\textbf {hh}}}^{\textrm{H}}{{\textbf {W}}})\ge \Gamma {\sigma _{{\textrm{n,CU}}}^2}, \end{aligned}$$
(32b)
$$\begin{aligned}&\textrm{tr} ({{\textbf {W}}}) \le {P}_{0}, \end{aligned}$$
(32c)
$$\begin{aligned}&{{\textbf {W}}}\succeq 0, \end{aligned}$$
(32d)
$$\begin{aligned}&{\text {rank}}({{\textbf {W}}}) = {1}. \end{aligned}$$
(32e)

However, this problem is still nonconvex due to the rank-one constraint on \({{\textbf {W}}}\) in (32e). Relaxing this rank-one constraint renders the SDR version of the problem (32), expressed as

$$\begin{aligned} \underset{{\textbf {W}}}{\text {max}} \ \ \underset{l_t\in {\mathcal {L}}}{\text {min}} \quad&{{\textbf {a}}}^{\textrm{H}}(\theta _{l_t},\phi _{l_t}) {\varvec{\Phi }} {{\textbf {G}}}{{\textbf {W}}}{{\textbf {G}}}^{\textrm{H}} {\varvec{\Phi }}^{\textrm{H}} {{\textbf {a}}}(\theta _{l_t},\phi _{l_t}) \end{aligned}$$
(33a)
$$\begin{aligned} \text {s.t.} \qquad&(32b), (32c), (32d). \end{aligned}$$
(33b)

This problem is a semidefinite program (SDP) that can be efficiently solved using convex solvers, such as CVX. Once the optimal solution of \({{\textbf {W}}}\) is obtained from (33) and denoted as \({{\textbf {W}}}^{{\textrm{opt}}}\), the optimized transmit beamforming vector at the BS, \({\hat{{{\textbf {w}}}}}\), is obtained as [16]

$$\begin{aligned} {\hat{{{\textbf {w}}}}}= ({{\textbf {h}}}^{\textrm{H}}{{\textbf {W}}}^{{\textrm{opt}}}{{\textbf {h}}})^{-\frac{1}{2}}{{\textbf {W}}}^{{\textrm{opt}}}{{\textbf {h}}}. \end{aligned}$$
(34)
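A quick numerical check of (34) is sketched below. It uses a random positive semidefinite matrix as a stand-in for \({{\textbf {W}}}^{{\textrm{opt}}}\) (no SDP is solved here) and a random combined channel \({{\textbf {h}}}\), and compares the CU signal power and the transmit power before and after the rank-one construction.

```python
import numpy as np

rng = np.random.default_rng(3)
M = 4                                          # hypothetical number of BS antennas
B = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
W_opt = B @ B.conj().T                         # stand-in PSD matrix, generally not rank one
h = rng.standard_normal(M) + 1j * rng.standard_normal(M)

hWh = np.real(h.conj() @ W_opt @ h)
w_hat = (W_opt @ h) / np.sqrt(hWh)             # rank-one construction of (34)

signal_relaxed = hWh                           # tr(h h^H W_opt)
signal_rank_one = np.abs(h.conj() @ w_hat) ** 2
power_relaxed = np.trace(W_opt).real
power_rank_one = np.vdot(w_hat, w_hat).real
```

The constructed vector reproduces the CU signal power of the relaxed solution exactly and never uses more transmit power than \(\textrm{tr}({{\textbf {W}}}^{{\textrm{opt}}})\), so the SNR and power constraints remain satisfied.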

3.2.2 Reflective beamforming optimization at RIS

Next, we optimize the reflective beamformer \({\varvec{\Phi }}\) in problem (30) when the transmit beamformer of the BS is obtained from (33) and (34). Then, the sensing beampattern gain from the RIS towards angle \(\{ \theta _{l_t}, \phi _{l_t}\}\) is given as

$$\begin{aligned} \rho (\theta _{l_t},\phi _{l_t}) = {{\textbf {v}}}^{\textrm{H}}{{\textbf {R}}}_1(\theta _{l_t},\phi _{l_t}){{\textbf {v}}}, \end{aligned}$$
(35)

where

$$\begin{aligned} {{\textbf {R}}}_1(\theta _{l_t},\phi _{l_t}) = \text {diag}({{\textbf {a}}}^{\textrm{H}}(\theta _{l_t},\phi _{l_t})){{\textbf {GWG}}}^{\textrm{H}}\text {diag}({{\textbf {a}}}(\theta _{l_t},\phi _{l_t})). \end{aligned}$$
(36)

Define

$$\begin{aligned} {{\textbf {R}}}_2(\theta _{l_t},\phi _{l_t}) = \begin{bmatrix} {{\textbf {R}}}_1(\theta _{l_t},\phi _{l_t}) &{} {{\varvec{0}}_{N\times 1}} \\ {{\varvec{0}}_{1\times N}} &{} 0 \\ \end{bmatrix}; \quad {\bar{{\textbf {v}}}} = \begin{bmatrix} {{\textbf {v}}}\\ 1\\ \end{bmatrix}. \end{aligned}$$
(37)

Then, by substituting (37) into (35), we have \(\rho (\theta _{l_t},\phi _{l_t})= {\bar{{\textbf {v}}}}^{\textrm{H}}{{\textbf {R}}}_2(\theta _{l_t},\phi _{l_t}){\bar{{\textbf {v}}}}\).

Let \({{\textbf {H}}} = \text {diag}({{\textbf {h}}}_r^{\textrm{H}}){{\textbf {G}}} \in {\mathbb {C}}^{N\times M}\). The received signal power at the CU can be written as \(|({{\textbf {h}}}_r^{\textrm{H}} {\varvec{\Phi }} {{\textbf {G}}}+ {{\textbf {h}}}_d^{\textrm{H}}){{\textbf {w}}}|^2= |({{\textbf {v}}}^{\textrm{H}}{{\textbf {H}}}+{{\textbf {h}}}_d^{\textrm{H}}){{\textbf {w}}}|^2\). Then, the output SNR constraint in (30b) is reformulated as,

$$\begin{aligned} ({{\textbf {H}}}^{\textrm{H}}{{\textbf {v}}}+{{\textbf {h}}}_d)^{\textrm{H}}{{\textbf {W}}}({{\textbf {H}}}^{\textrm{H}}{{\textbf {v}}}+{{\textbf {h}}}_d)\ge \Gamma {\sigma _{{\textrm{n,CU}}}^2}, \end{aligned}$$
(38)

which is equivalent to

$$\begin{aligned} {\bar{{\textbf {v}}}^{\textrm{H}}{{\textbf {R}}}_3}\bar{{\textbf {v}}} + {{\textbf {h}}}_d^{\textrm{H}}{{\textbf {W}}}{{\textbf {h}}}_d\ge \Gamma {\sigma _{{\textrm{n,CU}}}^2} \end{aligned}$$
(39)

with

$$\begin{aligned} {{\textbf {R}}}_3= \begin{bmatrix} {{\textbf {HWH}}}^{\textrm{H}} &{} {{\textbf {HWh}}}_d\\ {{\textbf {h}}}_d^{\textrm{H}}{{\textbf {WH}}}^{\textrm{H}} &{} 0\\ \end{bmatrix}. \end{aligned}$$
(40)
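The block structure of \({{\textbf {R}}}_3\) in (40) can be verified numerically: with \({\bar{{\textbf {v}}}} = [{{\textbf {v}}}^{\textrm{T}}, 1]^{\textrm{T}}\), the left-hand side of (39) should equal the signal power \(|({{\textbf {v}}}^{\textrm{H}}{{\textbf {H}}}+{{\textbf {h}}}_d^{\textrm{H}}){{\textbf {w}}}|^2\) when \({{\textbf {W}}} = {{\textbf {w}}}{{\textbf {w}}}^{\textrm{H}}\). The sketch uses random stand-in quantities of hypothetical size.

```python
import numpy as np

rng = np.random.default_rng(4)

def crandn(*shape):
    """Circular complex Gaussian samples with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

M, N = 3, 5                                    # hypothetical sizes
H, h_d, w = crandn(N, M), crandn(M), crandn(M)
v = np.exp(1j * 2 * np.pi * rng.random(N))     # unit-modulus RIS weights
W = np.outer(w, w.conj())

# Assemble R_3 block by block, following (40).
R3 = np.zeros((N + 1, N + 1), dtype=complex)
R3[:N, :N] = H @ W @ H.conj().T
R3[:N, N] = H @ W @ h_d
R3[N, :N] = h_d.conj() @ W @ H.conj().T
v_bar = np.append(v, 1.0)

lhs = np.real(v_bar.conj() @ R3 @ v_bar + h_d.conj() @ W @ h_d)   # left side of (39)
direct = np.abs((v.conj() @ H + h_d.conj()) @ w) ** 2             # |(v^H H + h_d^H) w|^2
```

The zero in the lower-right block of \({{\textbf {R}}}_3\) is what makes the separate term \({{\textbf {h}}}_d^{\textrm{H}}{{\textbf {W}}}{{\textbf {h}}}_d\) in (39) necessary.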

As a result, the optimization of \(\varvec{\Phi }\) in problem (30) becomes the optimization of \(\bar{{\textbf {v}}}\) in the following problem:

$$\begin{aligned} \underset{\bar{{\textbf {v}}}}{\text {max}} \ \ \underset{{l_t}\in {\mathcal {L}}}{\text {min}} \quad&{\bar{{\textbf {v}}}}^{\textrm{H}}{{\textbf {R}}}_2(\theta _{l_t},\phi _{l_t}){\bar{{\textbf {v}}}} \end{aligned}$$
(41a)
$$\begin{aligned} \text {s.t.} \qquad&{\bar{{\textbf {v}}}^{\textrm{H}}{{\textbf {R}}}_3}\bar{{\textbf {v}}} + {{\textbf {h}}}_d^{\textrm{H}}{{\textbf {W}}}{{\textbf {h}}}_d \ge \Gamma {\sigma _{{\textrm{n,CU}}}^2}, \end{aligned}$$
(41b)
$$|{\overline{{\textbf {v}}}_n}|=1, \ \ \forall n\in \{1,\cdots ,N+1\}.$$
(41c)

To solve \({\bar{{\textbf {v}}}}\) using SDR, we define \(\bar{{\textbf {V}}}= {\bar{{\textbf {v}}}\bar{{\textbf {v}}}^{\textrm{H}}}\) with \({\bar{{\textbf {V}}}}\succeq 0\) and \(\text {diag}{(\bar{{\textbf {V}}})}=1\). Noting \({\bar{{\textbf {v}}}}^{\textrm{H}}{{\textbf {R}}}_2(\theta _{l_t},\phi _{l_t}){\bar{{\textbf {v}}}}= \text {tr}({{\textbf {R}}}_2(\theta _{l_t},\phi _{l_t}){\bar{{\textbf {V}}}})\) and \({\bar{{\textbf {v}}}}^{\textrm{H}}{{\textbf {R}}}_3{\bar{{\textbf {v}}}}= \text {tr}({{\textbf {R}}}_3{\bar{{\textbf {V}}}) }\), the reflective beamforming optimization in problem (41) is reformulated as

$$\begin{aligned} \underset{\bar{{\textbf {V}}}}{\text {max}} \ \ \underset{{l_t}\in {\mathcal {L}}}{\text {min}}\quad&\text {tr}({{\textbf {R}}}_2(\theta _{l_t},\phi _{l_t}){\bar{{\textbf {V}}}}) \end{aligned}$$
(42a)
$$\begin{aligned} \text {s.t.} \qquad&\text {tr}({{\textbf {R}}}_3{\bar{{\textbf {V}}}}) + {{\textbf {h}}}_d^{\textrm{H}}{{\textbf {W}}}{{\textbf {h}}}_d \ge \Gamma {\sigma _{{\textrm{n,CU}}}^2}, \end{aligned}$$
(42b)
$$[\bar{{\textbf {V}}}]_{n,n}=1, \ \ \forall n\in \{1,\cdots ,N+1\},$$
(42c)
$$\begin{aligned}&\bar{{\textbf {V}}} \succeq {\varvec{0}}, \end{aligned}$$
(42d)
$$\begin{aligned}&\text {rank}({\bar{{\textbf {V}}}})=1. \end{aligned}$$
(42e)
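
The lifting step used to reach problem (42) can be checked numerically. The following is a minimal sketch in plain Python, using a hypothetical 3-by-3 Hermitian matrix `R` and a hypothetical unit-modulus vector `v` (neither is taken from the paper). It confirms that the quadratic form equals the trace form for the lifted matrix and that the diagonal of the lifted matrix is all ones.

```python
import cmath

def quad_form(R, v):
    """Return v^H R v for a square matrix R (list of rows) and a vector v."""
    n = len(v)
    return sum(v[i].conjugate() * R[i][j] * v[j]
               for i in range(n) for j in range(n))

def lift(v):
    """Return the rank-one matrix V = v v^H."""
    return [[vi * vj.conjugate() for vj in v] for vi in v]

def trace_product(R, V):
    """Return tr(R V) without forming the full matrix product."""
    n = len(R)
    return sum(R[i][j] * V[j][i] for i in range(n) for j in range(n))

# Hypothetical data: unit-modulus entries, with the last entry fixed to 1.
v = [cmath.exp(1j * 0.3), cmath.exp(-1j * 1.1), 1 + 0j]
R = [[2.0 + 0j, 1 - 1j, 0.5j],
     [1 + 1j, 3.0 + 0j, -1 + 0j],
     [-0.5j, -1 + 0j, 0.0 + 0j]]

V = lift(v)
lhs = quad_form(R, v)       # v^H R v
rhs = trace_product(R, V)   # tr(R V)
```

Because the two values agree for any vector, the only non-convex part left in the lifted problem is the rank-one constraint.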

Similarly, we relax the rank-one constraint (42e) and obtain the SDR version of problem (42) as

$$\begin{aligned} \underset{\bar{{\textbf {V}}}}{\text {max}} \ \ \underset{{l_t}\in {\mathcal {L}}}{\text {min}} \quad&\text {tr}({{\textbf {R}}}_2(\theta _{l_t},\phi _{l_t}){\bar{{\textbf {V}}}}) \end{aligned}$$
(43a)
$$\begin{aligned} \text {s.t.} \qquad&(42b), (42c), (42d). \end{aligned}$$
(43b)

Let \(\bar{{\textbf {V}}}^{\textrm{opt}}\) denote the optimal solution of problem (43). When its rank is higher than one, Gaussian randomization can be used to construct an approximate rank-one solution. In each randomization trial, if the resulting gain is higher than the best gain found so far, the targets’ beampattern gain and \({{\textbf {v}}}\) are updated together. Otherwise, a new \({{\textbf {v}}}\) is randomly generated.
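
The randomization step can be sketched as follows. This is an illustrative NumPy sketch under stated assumptions, not the authors' implementation: the function name, its arguments, the number of trials, and the convention that the auxiliary entry of \(\bar{{\textbf {v}}}\) is the last one are all assumptions made for the example.

```python
import numpy as np

def gaussian_randomization(V_opt, R2_list, R3, offset, threshold,
                           num_trials=200, seed=0):
    """Draw candidates from CN(0, V_opt) and keep the best feasible one.

    V_opt     : (N+1, N+1) Hermitian PSD solution of the relaxed problem
    R2_list   : matrices R_2(theta_l, phi_l), one per target direction
    R3        : matrix of the SNR constraint
    offset    : the constant h_d^H W h_d
    threshold : Gamma * sigma^2
    """
    rng = np.random.default_rng(seed)
    # Factor V_opt = F F^H through its eigendecomposition.
    eigval, eigvec = np.linalg.eigh(V_opt)
    F = eigvec * np.sqrt(np.clip(eigval, 0.0, None))
    n = V_opt.shape[0]

    best_v, best_gain = None, -np.inf
    for _ in range(num_trials):
        g = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
        r = F @ g
        # Rotate so the (assumed last) auxiliary entry is 1, then keep
        # only the phases to satisfy the unit-modulus constraint.
        v = np.exp(1j * (np.angle(r) - np.angle(r[-1])))
        snr_term = np.real(v.conj() @ R3 @ v) + offset
        gain = min(np.real(v.conj() @ R2 @ v) for R2 in R2_list)
        # Update only if the candidate is feasible and improves the gain.
        if snr_term >= threshold and gain > best_gain:
            best_v, best_gain = v, gain
    return best_v, best_gain
```

If no candidate satisfies the SNR constraint, the sketch returns `None` together with a gain of negative infinity, so a caller would need to handle that case, for example by increasing `num_trials`.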

Note that, with a sufficient number of randomizations, the objective value obtained from problem (42) is monotonically non-decreasing [15]. As a result, the convergence of the proposed alternating optimization-based algorithm for solving problem (30) is ensured.

3.3 Phase III: NLOS target localization

In this section, we estimate the 2-D DOA of the targets using the signal reflected by the RIS. The active elements of the RIS are sparsely placed in an L shape, which allows the azimuth and elevation angles of the targets to be determined independently. Here, we demonstrate the estimation of the elevation angles of the targets using the RIS z-axis subarray. The signals received at the azimuth subarray can be formulated similarly, and the azimuth angles of the targets can be computed in the same way using the matrix interpolation and pair-matched MUSIC processes.

The echo channel reflected from the target over the RIS-reflected link, denoted as BS-RIS-targets-RIS(z), is given by

$$\begin{aligned} \begin{aligned} {{\textbf {G}}}_r(t)&= {{\textbf {H}}}_{tz}{\varvec{\Phi }}{{\textbf {G}}}\\&= {\tilde{{{\textbf {B}}}}}_{tz}\text {diag}({\varvec{\delta }}_{{\textrm{TI}}}) {{\textbf {B}}}_{t}^{\textrm{H}} {\varvec{\Phi }}{{\textbf {A}}}_{l_r}\text {diag}({\varvec{\alpha }}_{{\textrm{BI}}}) {{\textbf {F}}}_{l_r}^{\textrm{H}}. \end{aligned} \end{aligned}$$
(44)

As such, the received signals at the vertical RIS subarray sensors are given by

$$\begin{aligned} {{\textbf {y}}}_0(t) = \left( {{\textbf {G}}}_r(t) + {{\textbf {G}}} \right) {{\textbf {w}}}s(t) + {{\textbf {n}}}_0(t). \end{aligned}$$
(45)

For the estimation of the target directions, the signal received directly from the BS, \({{\textbf {Gw}}}s(t)\) in the above expression, acts as interference. Because \({{\textbf {G}}}\) is fixed and known, we can remove this component and obtain the interference-free target echo signal as

$$\begin{aligned} \begin{aligned} {{\textbf {y}}}(t)&= {{\textbf {G}}}_r(t) {{\textbf {w}}}s(t)+ {{\textbf {n}}}_0(t)\\&= {\tilde{{{\textbf {B}}}}}_{tz}\text {diag}({\varvec{\delta }}_{{\textrm{TI}}}) {{\textbf {B}}}_{t}^{\textrm{H}} {\varvec{\Phi }}{{\textbf {A}}}_{l_r}\text {diag}({\varvec{\alpha }}_{{\textrm{BI}}}) {{\textbf {F}}}_{l_r}^{\textrm{H}}{{\textbf {w}}}s(t) + {{\textbf {n}}}_0(t). \end{aligned} \end{aligned}$$
(46)
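
The interference removal leading to (46) amounts to subtracting a known vector from each snapshot. A minimal sketch in plain Python is given below. The helper names are hypothetical, and `G_sub` is assumed to contain only the rows of \({{\textbf {G}}}\) that correspond to the active subarray sensors, so that its output has the same length as the received snapshot.

```python
def matvec(A, x):
    """Multiply matrix A (list of rows) by vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def remove_direct_path(y0, G_sub, w, s):
    """Subtract the known BS-to-RIS component G w s(t) from one snapshot y0.

    y0    : received snapshot at the active subarray
    G_sub : rows of the known BS-RIS channel seen by that subarray (assumed)
    w     : BS transmit beamforming vector
    s     : transmitted symbol for this snapshot
    """
    direct = matvec(G_sub, w)
    return [y - d * s for y, d in zip(y0, direct)]
```

What remains after the subtraction is the target echo plus noise, which is the input to the covariance estimation that follows.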

As the active elements on the RIS form a sparse structure, we can perform matrix completion, similar to (15), to fill in the missing elements and generate the interpolated covariance matrix \({{\hat{{{\textbf {R}}}}}_Z}\). \({{\textbf {B}}}_t\) at the RIS can then be obtained by following the same pair-matched MUSIC and gain estimation procedures described in Sect. 3.1.1.

The normalized RMSE of the estimated target response at the RIS is defined as

$$\begin{aligned} \text {Normalized RMSE} \triangleq {\sqrt{{\frac{1}{K_t}\sum _{k=1}^{K_t}\frac{\Vert {\hat{{{\textbf {B}}}}}_t-{{\textbf {B}}}_t\Vert _{\textrm{F}}^2}{\Vert {{\textbf {B}}}_t\Vert _{\textrm{F}}^2}}}}, \end{aligned}$$
(47)

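where \(K_t\) is the number of independent trials and \({\hat{{{\textbf {B}}}}}_t\) is the estimate of \({{\textbf {B}}}_t\) obtained in the k-th trial.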
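
The metric in (47) can be computed as in the following plain Python sketch. The function names are hypothetical, and matrices are represented as lists of rows of complex numbers. Each entry of `estimates` is assumed to be the estimate produced by one independent trial.

```python
import math

def frobenius_sq(A):
    """Return the squared Frobenius norm of a matrix given as a list of rows."""
    return sum(abs(a) ** 2 for row in A for a in row)

def normalized_rmse(estimates, B_true):
    """Normalized RMSE of per-trial estimates with respect to B_true."""
    ref = frobenius_sq(B_true)
    total = 0.0
    for B_hat in estimates:
        diff = [[a - b for a, b in zip(row_hat, row_true)]
                for row_hat, row_true in zip(B_hat, B_true)]
        total += frobenius_sq(diff) / ref
    return math.sqrt(total / len(estimates))
```

Normalizing by the norm of the true matrix makes the metric independent of the absolute signal level, which is convenient when the transmit power or distance is swept.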

4 Simulation results

In this section, we provide simulation results to demonstrate the effectiveness of the proposed active RIS-assisted ISAC scheme and the superiority of the L-shaped hybrid ONRA-based sparse array structure over other L-shaped structures. The RIS contains \(N_x = N_z = 23\) elements in each dimension, rendering a total number of \(N = 529\) elements.

We use a small number of \({{\bar{N}}}=11\) active elements in all L-shaped sparse arrays and examine the channel estimation and target DOA estimation performance. Figure 2 shows the structures of the different sparse active subarrays along a single axis. Each subarray uses 6 antenna elements. The same subarray is used along the x and z axes, and the two subarrays together form an L-shaped sparse active array. Note that the two subarrays share the element located at the 0-th position, so the total number of active elements is \(2\times 6-1=11\). For the hybrid ONRA configuration, the positions of the active elements along the x and the z axes are \({{\mathbb {X}}} = {{\mathbb {Z}}} = \{0,3,7,12,20,22\}\lambda /2\). This yields 16 nonnegative lags located at \({{\mathbb {D}}}_{\textrm{self}}^{X} = {{\mathbb {D}}}_{\textrm{self}}^{Z} = \{0,2,3,4,5,7,8,9,10,12,13,15,17,19,20,22\}\lambda /2\). It achieves the largest array aperture, which coincides with the size of the RIS, rendering \(W_x = N_x = W_z = N_z = 23\) in this case.
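
The lag set quoted for the hybrid ONRA positions can be reproduced with a few lines of plain Python. This is only a sketch for checking the numbers. The function and variable names are hypothetical, and positions are expressed in units of half a wavelength.

```python
def nonnegative_lags(positions):
    """Return the sorted set of nonnegative pairwise position differences."""
    return sorted({abs(p - q) for p in positions for q in positions})

onra = [0, 3, 7, 12, 20, 22]           # element positions, units of lambda/2
lags = nonnegative_lags(onra)          # self-difference co-array lags
holes = [d for d in range(max(onra) + 1) if d not in lags]
num_active = 2 * len(onra) - 1         # the two arms share position 0
aperture = max(onra) + 1               # number of half-wavelength slots spanned
```

The entries of `holes` are the lags in the aperture that the physical array does not provide. These are the missing covariance entries that the interpolation step is meant to fill.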

Fig. 2 Single-axis array structures of the different types of sparse arrays under consideration

Fig. 3 RMSE of the CU-RIS channel versus the number of snapshots when the paths are closely spaced (transmit power = 10 dBm)

Fig. 4 RMSE of the CU-RIS channel versus the number of snapshots when the paths are closely spaced (transmit power = 20 dBm)

Fig. 5 RMSE of the CU-RIS channel versus the number of snapshots when the paths are well separated (transmit power = 10 dBm)

Fig. 6 RMSE of the CU-RIS channel versus the angular separation of the multipath (transmit power = 10 dBm)

Fig. 7 RMSE of the CU-RIS channel versus the transmit power of the CU

Fig. 8 RMSE of the CU-RIS channel versus the distance between the CU and the RIS (transmit power = 10 dBm)

Fig. 9 Optimized beampattern gain versus the number of iterations

We compare the performance of the L-shaped hybrid ONRA-based sparse array structure with that of L-shaped arrays built from orthogonally placed ULAs and from popular sparse array structures, including the coprime array [30], nested array [29], coprime array with minimum lag redundancies (CAMLR) [40], and super nested array (SNA) [41]. These five array configurations respectively achieve 5, 9, 12, 12, and 12 unique nonnegative co-array lags and have smaller array apertures, as shown in Fig. 2 [34]. The transmit array of the BS is a vertically placed ULA with \(M = 4\) antennas.

The large-scale path loss for a CU-RIS path with distance r is given as \(\textrm{PL}(r)[\text {dB}]= 10\log _{10}\left( (4\pi f_c/c)^2\right) +10 \, \alpha \, \log _{10}(r/r_0)\), where \(f_c\), \(\alpha\), and \(r_0\) are the carrier frequency, the path loss exponent, and the reference distance, respectively, and c is the speed of light [42]. In this paper, \(f_c=28\) GHz and \(r_0=1\ \text {m}\) are assumed.
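
The path loss model can be evaluated as in the following plain Python sketch. The function name is hypothetical, and the path loss exponent is left as a required argument because its value is not stated in this section. Any value passed to it in a usage example is an assumption.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def path_loss_db(r, alpha, fc=28e9, r0=1.0):
    """Large-scale path loss in dB at distance r (meters).

    alpha : path loss exponent (must be supplied by the caller)
    fc    : carrier frequency in Hz (28 GHz as in the text)
    r0    : reference distance in meters (1 m as in the text)
    """
    reference = 10 * math.log10((4 * math.pi * fc / C) ** 2)
    return reference + 10 * alpha * math.log10(r / r0)
```

At the reference distance the second term vanishes, so the model reduces to the free-space loss at 1 m, which is about 61.4 dB at 28 GHz. Each tenfold increase in distance then adds `10 * alpha` dB.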

The BS and the RIS have a fixed distance of \(d_{\text {BI}}=10.2\) m, and the distance between the RIS and the CU is \(d_{\text {IC}}=120\) m. There are two targets, and their 2-D angles with respect to the RIS are respectively [\(10^\circ , 30^\circ\)] and [\(30^\circ , 10^\circ\)]. The two-way complex-valued path gain \(\delta _{l_t} = \sqrt{{\lambda ^2k}/({64\pi ^3d_{\textrm{IT}}^4})}\) denotes the signal attenuation caused by the propagation from the RIS to the target and then from the target back to the RIS. We assume that both targets are equally distant from the RIS, where \(d_{\text {IT}}=10.2\) m is the distance between the RIS and each target, and \(k = 7\) dBm denotes the radar cross-section.
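
The magnitude of the two-way path gain can be evaluated as in the sketch below, written in plain Python with a hypothetical function name. The radar cross-section is passed as a linear quantity. How the stated value of \(k\) is converted to linear units is left to the caller and is not assumed here.

```python
import math

def two_way_path_gain(d_it, rcs_linear, fc=28e9):
    """Magnitude of the two-way RIS-target-RIS path gain.

    d_it       : RIS-to-target distance in meters
    rcs_linear : radar cross-section k in linear units (caller's conversion)
    fc         : carrier frequency in Hz (28 GHz as in the text)
    """
    wavelength = 299_792_458.0 / fc
    return math.sqrt(wavelength ** 2 * rcs_linear
                     / (64 * math.pi ** 3 * d_it ** 4))
```

Because the distance enters with a fourth power under the square root, doubling the RIS-target distance reduces the amplitude gain by a factor of four, whereas the gain grows only with the square root of the radar cross-section.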

For the CU, we assume \(L_u= 2\) paths for the CU-RIS channel \({{\textbf {h}}}_r\).

We consider two distinct scenarios. The first scenario stresses the resolution capability of the different array structures: the multipath is closely spaced with the LOS path, and the DOA separation observed at the RIS is only \(2^{\circ }\) in both azimuth and elevation. In the second scenario, we consider well-separated multipath whose azimuth and elevation angles both deviate from the LOS by \(20^{\circ }\). Comparing the closely spaced and well-separated cases reveals the performance advantages of the hybrid ONRA array construction.

Figure 3 considers the first scenario with closely spaced multipath and compares the normalized RMSE of the estimated channels with respect to the number of snapshots for different array configurations. In this case, the CU’s transmit power is 10 dBm, whereas the noise power is \(-80\) dBm. The results show that the RIS with L-shaped hybrid ONRA active elements achieves noticeably higher channel estimation accuracy than the other sparse array structures using the same number of active elements (\({\bar{N}}=11\)). In Fig. 4, we increase the transmit power of the CU to 20 dBm, and the RMSE performance of all array configurations improves as a result.

In Fig. 5, we consider the second scenario with well separated multipath signals. With a \(20^{\circ }\) separation in both azimuth and elevation, the normalized RMSE of all sparse structures is much lower and the differences between different array configurations are smaller.

Figure 6 depicts the CU-RIS channel estimation results for different angular separations between the CU-RIS multipath, where the transmit power from the CU is 10 dBm and the number of snapshots is 5000. When the difference between the two azimuth angles and between the two elevation angles is only \(0.1^\circ\), the channel estimation accuracy is very poor for all array configurations. For all array designs, channel estimation performance improves with increasing multipath separation, but the ONRA structure remains accurate even with a \(1^\circ\) separation, unlike the other structures. This figure further highlights the superior estimation performance of the hybrid ONRA over the other array architectures.

Fig. 10 Normalized RMSE of the estimated target signal versus the number of snapshots for different array configurations

Figure 7 depicts the normalized RMSE performance of the estimated CU-RIS channels as the transmit power varies between 0 dBm and 30 dBm, where the number of snapshots is fixed to 5000, and the two paths are closely spaced with \(2^{\circ }\) separation in both azimuth and elevation. It is observed that the normalized RMSE generally reduces as the transmit power increases and, for this closely spaced multipath scenario, the hybrid ONRA-based active RIS configuration performs much better than other sparse array structures.

In Fig. 8, we show the normalized RMSE of the estimated CU-RIS channel versus the distance between the CU and the RIS. The number of snapshots is 5000, the transmit power is 10 dBm, and the angular separation is \(2^{\circ }\) in both azimuth and elevation. It is observed that, as the distance increases, the signal is more attenuated, yielding higher normalized RMSE results. The ONRA structure consistently outperforms the other array configurations.

Figure 9 shows the convergence performance of the minimum beampattern gain obtained in phase II based on the alternating optimization-based algorithm. Here, the transmit power from the BS is 40 dBm, the noise power is \(-80\) dBm, and \(\Gamma = 10\) dB is assumed. Because the convergence performance depends on randomization, the plotted results are obtained by averaging over 5 independent trials. For comparison, we also plot the result obtained when the true CSI of the BS-RIS, BS-CU, and CU-RIS channels is perfectly known at the BS during optimization. We observe that most of the array configurations converge within 6–8 iterations. Sparse subarrays using the ONRA structure achieve the highest minimum beampattern gain, which is very close to that obtained under the assumption that accurate CSI is known.

In phase III, we estimate the target-reflected signals. Figure 10 shows the normalized RMSE performance versus the number of snapshots, where the transmit power from the BS is 40 dBm and the noise power is \(-109\) dBm. It is again verified that the hybrid ONRA structure offers better performance than the other sparse array configurations.

5 Conclusion

In this paper, we presented an ISAC system assisted by an RIS in which a small number of elements are active. L-shaped sparse arrays are used for channel estimation and target DOA estimation. A nuclear norm-based interpolation technique is used to fill in the gaps in the covariance matrix and achieve improved channel estimation and target DOA estimation performance. The array and RIS gains are optimized to concurrently maintain a specific SNR requirement toward the communication user and beampattern gains toward the targets, which are assumed to have LOS only with the RIS. Simulation results demonstrate that the proposed hybrid ONRA RIS structure outperforms the other sparse array structures considered.