
Interpretable artificial intelligence for modulated metasurface antenna design using SHAP and MLP



For dataset generation, a rapid and accurate analytical model is essential to efficiently produce a large amount of data within a short timeframe. Two well-established approaches for analyzing modulated metasurface antennas are the MoM36,43,50 and the Flat Optics framework38,39. Both methodologies utilize a homogeneous impedance boundary condition (IBC) in their analytical formulations. Employing a homogeneous IBC eliminates the need for mesh generation and assignment of basis functions at the unit-cell level, greatly enhancing computational efficiency. This increased efficiency streamlines the optimization process and enables extensive sensitivity analyses of modulated metasurface antennas.

Fig. 1 depicts a conceptual schematic of a modulated metasurface antenna. The antenna comprises a sinusoidally modulated anisotropic impedance surface implemented by subwavelength, locally periodic patches printed on a grounded dielectric substrate32. A vertical monopole positioned at the antenna’s center excites the metasurface, generating a cylindrical surface wave. As this wave propagates across the modulated IBC, a portion of its energy progressively transitions into a radiative mode characterized by specific linear and angular momentum states.

The modulation index of the surface impedance can be uniform or tapered, directly influencing the aperture field amplitude distribution. The inset on the right side of Fig. 1 illustrates a tapered modulation of the surface impedance. The taper shape significantly affects the SLL and beamwidth; thus, by carefully selecting the taper profile, desirable radiation characteristics can be achieved.

Fig. 1

Conceptual schematic of a modulated metasurface antenna featuring a tapered IBC modulation (right inset), which generates a broadside pencil beam characterized by a specific SLL and HPBW (left inset).

In this paper, we employ the Flat Optics framework to analyze the antenna structure. This method accurately determines the phase constant and leakage coefficient of leaky-wave modulated metasurfaces. In the general case, the modulation form of the surface impedance tensor can be expressed as38:

$$\begin{aligned} \underline{\underline{Z}}_s(\vec {\rho }) = Z_{\rho \rho }(\vec {\rho }) \hat{\rho }\hat{\rho } + Z_{\rho \phi }(\vec {\rho }) (\hat{\rho }\hat{\phi } + \hat{\phi }\hat{\rho }) + Z_{\phi \phi }(\vec {\rho }) \hat{\phi }\hat{\phi } \end{aligned}$$

(1)

where

$$\begin{aligned} \begin{aligned} Z_{\rho \rho }(\vec {\rho }) =&jX_0 [1 + m_\rho (\vec {\rho }) \cos (\Psi _{\rho \rho }(\vec {\rho }))] \\ Z_{\rho \phi }(\vec {\rho }) =&jX_0 m_\phi (\vec {\rho }) \cos (\Psi _{\rho \phi }(\vec {\rho })) \\ Z_{\phi \phi }(\vec {\rho }) =&jX_0 [1 - m_\rho (\vec {\rho }) \cos (\Psi _{\phi \phi }(\vec {\rho }))] \end{aligned} \end{aligned}$$

(2)

and \(\Psi _{\rho \rho }(\vec {\rho })\), \(\Psi _{\rho \phi }(\vec {\rho })\), and \(\Psi _{\phi \phi }(\vec {\rho })\) are the modulation phases, which determine the direction and angular momentum states of the radiated wave. The coefficients \(m_\rho (\vec {\rho })\) and \(m_\phi (\vec {\rho })\) are the modulation indices, which play a fundamental role in the leakage.
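For readers implementing the synthesis numerically, the three tensor components in (2) can be evaluated pointwise once the modulation indices and phases are available. The following minimal Python sketch illustrates this step; the grid, the value of \(X_0\), and the placeholder index and phase profiles are illustrative assumptions, not design values from this work.

```python
import numpy as np

# Illustrative assumptions: average reactance and a polar grid over the aperture.
X0 = 0.8 * 376.73                    # average surface reactance [ohm] (assumed here)
rho = np.linspace(1e-3, 0.15, 300)   # radial samples [m]
phi = np.linspace(0.0, 2.0 * np.pi, 181)
R, PHI = np.meshgrid(rho, phi, indexing="ij")

# Placeholder modulation indices and phases; in the actual synthesis these
# come from Eqs. (12)-(14).
m_rho = 0.1 * np.ones_like(R)
m_phi = 0.05 * np.ones_like(R)
psi_rr = psi_rp = psi_pp = 50.0 * R  # dummy linear modulation phase [rad]

# Impedance tensor components of Eq. (2).
Z_rr = 1j * X0 * (1.0 + m_rho * np.cos(psi_rr))
Z_rp = 1j * X0 * (m_phi * np.cos(psi_rp))
Z_pp = 1j * X0 * (1.0 - m_rho * np.cos(psi_pp))
```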

To calculate the modulation phases and indices, the aperture field estimation (AFE) method40,51 is employed. Within the AFE framework, the surface impedance required to synthesize the tangential aperture field, \(\vec {E}_{ap} (\vec {\rho }) = E_{ax}(\vec {\rho }) \hat{x} + E_{ay}(\vec {\rho }) \hat{y}\), can be expressed as follows40:

$$\begin{aligned} \underline{\underline{Z}}_s(\vec {\rho }) \cdot \hat{\rho } = j X_0 \left[ \hat{\rho } + 2 \Im \left\{ \frac{\vec {E}_{ap}(\vec {\rho })}{-E_0 H_1^{(2)}(k_{sw}\rho )}\right\} \right] \end{aligned}$$

(3)

where \(H_1^{(2)}(k_{sw}\rho )\) denotes the Hankel function of the second kind and first order, representing the reference wave generated by the monopole feed. The reactance \(X_0\) is the average surface reactance and controls the slope of the dispersion curve52. The monopole is responsible for generating the \(TM_0\) surface mode that excites the hologram. The coefficient \(k_{sw} = \beta _{sw} - j \alpha _{sw}\) represents the fundamental surface wavenumber, and its exact value is determined using the Flat Optics method. When selecting the tangential aperture field distribution, it is important to consider the following four parameters (a numerical sketch of the synthesis step in (3) is given after the list):

  • Beam orientation, which is determined by the wave propagation vector. In this paper, the beam is oriented in the broadside direction.

  • Beam shape and sidelobe level, which can be controlled by appropriately selecting the amplitude distribution of the aperture field. In this work, to achieve better control over the sidelobe level and half-power beamwidth, a non-uniform amplitude distribution is considered, which will be discussed in this section.

  • Wave polarization, which is influenced by the ratio of the x and y components of the aperture field. In the present work, right-hand circular polarization is chosen for the radiated field.

  • Topological charge of the radiated beam, which is determined by the phase of the aperture field. In this paper, a pencil beam with zero topological charge is selected.
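Before specializing the aperture field, the synthesis relation (3) itself is straightforward to evaluate numerically. The sketch below (referred to above) computes the \(\rho \rho\) and \(\rho \phi\) impedance components along the \(\phi = 0\) cut for a placeholder aperture field; the surface wavenumber, amplitudes, and decay rate are assumed values used only for illustration.

```python
import numpy as np
from scipy.special import hankel2

eta0 = 376.73
X0 = 0.8 * eta0                      # average reactance [ohm] (assumed)
E0 = 1.0                             # reference-wave amplitude
k_sw = 500.0 - 5.0j                  # assumed complex surface wavenumber [1/m]

rho = np.linspace(5e-3, 0.15, 400)   # radial samples along the phi = 0 cut [m]

# Placeholder tangential aperture field; at phi = 0 the x and y components
# coincide with the rho and phi components, respectively.
E_ax = np.exp(-5.0 * rho)
E_ay = -1j * E_ax                    # crude circular-polarization-like ratio

# Reference surface wave launched by the monopole: -E0 * H1^(2)(k_sw * rho).
ref = -E0 * hankel2(1, k_sw * rho)

# Eq. (3): Z_s . rho_hat = j X0 [ rho_hat + 2 Im{ E_ap / ref } ]
Z_rho_rho = 1j * X0 * (1.0 + 2.0 * np.imag(E_ax / ref))
Z_rho_phi = 1j * X0 * (2.0 * np.imag(E_ay / ref))
```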

Considering the above parameters, the aperture field distribution may be assumed in the following form51:

$$\begin{aligned} \vec {E}_{ap}(\vec {\rho }) = \frac{E_0}{\sqrt{2\pi \rho k_{sw}}} e^{-\alpha _{sw}\rho } e^{-jk\rho \sin \theta _0 \cos (\phi - \phi _0)} [M_x(\vec {\rho })\hat{x} + M_y(\vec {\rho })\hat{y}] \end{aligned}$$

(4)

In the above equation, the pair (\(\theta _0\), \(\phi _0\)) represents the beam direction in space, with free-space wavenumber \(k = 2\pi f \sqrt{\mu _0\epsilon _0}\), indicating the linear momentum state of the wave. The parameters \(M_x(\vec {\rho })\) and \(M_y(\vec {\rho })\) are modulation coefficients, and their amplitude distributions control the sidelobe level of the far-field pattern. Moreover, the relationship between them determines the polarization of the radiated wave. It is worth noting that the coefficient \(\frac{E_0}{\sqrt{2\pi \rho k_{sw}}}\) in (4) is introduced to ensure that, after substituting the equation into (3), the surface impedance depends solely on \(M_x(\vec {\rho })\) and \(M_y(\vec {\rho })\). Note that this coefficient is defined for synthesis simplicity and may be disregarded by the designer if desired. To better understand how the polarization can be controlled, we need to express the relationship between the aperture and radiated fields, which is as follows53:

$$\begin{aligned} & \vec {E}_{farfield}(r, \theta , \phi ) \approx j \frac{ke^{-jkr}}{2\pi r} [F_\theta (\theta , \phi ) \hat{\theta } + F_\phi (\theta , \phi ) \hat{\phi }] \end{aligned}$$

(5)

$$\begin{aligned} & F_\theta (\theta , \phi ) = \tilde{E}_{ax}(\theta , \phi ) \cos \phi + \tilde{E}_{ay}(\theta , \phi ) \sin \phi \end{aligned}$$

(6)

$$\begin{aligned} & F_\phi (\theta , \phi ) = \cos \theta (-\tilde{E}_{ax}(\theta , \phi ) \sin \phi + \tilde{E}_{ay}(\theta , \phi ) \cos \phi ) \end{aligned}$$

(7)

where \(\tilde{E}_{ax}(\theta , \phi )\) and \(\tilde{E}_{ay}(\theta , \phi )\) are the Fourier integrals of the x and y components of the aperture field, respectively:

$$\begin{aligned} & \tilde{E}_{ax}(\theta , \phi ) = \iint _{ap} E_{ax}(\vec {\rho }') e^{jk \rho ' \cos (\phi - \phi ')} \rho ' d\rho ' d\phi ' \end{aligned}$$

(8)

$$\begin{aligned} & \tilde{E}_{ay}(\theta , \phi ) = \iint _{ap} E_{ay}(\vec {\rho }') e^{jk \rho ' \cos (\phi - \phi ')} \rho ' d\rho ' d\phi ' \end{aligned}$$

(9)

The condition for circular polarization of the radiated field at (\(\theta _0\), \(\phi _0\)) requires that:

$$\begin{aligned} F_\phi (\theta _0, \phi _0) = e^{\pm j \frac{\pi }{2}} F_\theta (\theta _0, \phi _0) \end{aligned}$$

(10)

where the minus and plus signs represent the right-hand and left-hand polarizations, respectively. Substituting the aperture field components into (6) and (7) and applying the circular polarization condition yields:

$$\begin{aligned} M_y(\vec {\rho }) = M_x(\vec {\rho }) \frac{\cos \theta _0 \sin \phi _0 + e^{\pm j \pi /2} \cos \phi _0}{\cos \theta _0 \cos \phi _0 - e^{\pm j \pi /2} \sin \phi _0} \end{aligned}$$

(11)

It should be noted that, in equation (4), the parameter \(\alpha _{sw}\) is initially unknown and must be determined analytically. Analytical methods for extracting \(\alpha _{sw}\) are discussed in detail in the following section.
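The polarization condition (11) is easy to evaluate for a given beam direction. As a minimal check (the angles and the sign choice below correspond to the broadside right-hand case considered in this work), the ratio \(M_y/M_x\) reduces to \(-j\):

```python
import numpy as np

theta0 = np.deg2rad(0.0)   # broadside beam
phi0   = np.deg2rad(0.0)
sign   = -1.0              # minus sign in Eq. (10): right-hand circular polarization

num = np.cos(theta0) * np.sin(phi0) + np.exp(sign * 1j * np.pi / 2) * np.cos(phi0)
den = np.cos(theta0) * np.cos(phi0) - np.exp(sign * 1j * np.pi / 2) * np.sin(phi0)
print(num / den)           # M_y / M_x from Eq. (11); evaluates to -1j at broadside
```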

By substituting (4) into (3), the modulation indices \(m_\rho (\vec {\rho })\) and \(m_\phi (\vec {\rho })\) can be expressed as:

$$\begin{aligned} \begin{aligned} m_\rho (\vec {\rho }) =&|M_x(\vec {\rho }) \cos \phi + M_y(\vec {\rho }) \sin \phi |\\ m_\phi (\vec {\rho }) =&|-M_x(\vec {\rho }) \sin \phi + M_y(\vec {\rho }) \cos \phi | \end{aligned} \end{aligned}$$

(12)

which directly influence the surface wave leakage coefficient and consequently control the antenna’s beamwidth and SLL. Intuitively, increasing the modulation index intensifies disturbances within the surface waveguide, thus increasing the leakage coefficient. This results in more efficient conversion from surface to radiating modes, reducing the effective radiating aperture and, therefore, decreasing the antenna gain. The modulation phases are also obtained as follows:

$$\begin{aligned} & \Psi _{\rho \rho }(\vec {\rho }) = \beta _{sw}\rho - k \rho \sin \theta _0 \cos (\phi -\phi _0) + \arg \{M_x(\vec {\rho }) \cos \phi + M_y(\vec {\rho }) \sin \phi \} \end{aligned}$$

(13)

$$\begin{aligned} & \Psi _{\rho \phi }(\vec {\rho }) = \beta _{sw}\rho - k \rho \sin \theta _0 \cos (\phi -\phi _0) + \arg \{-M_x(\vec {\rho }) \sin \phi + M_y(\vec {\rho }) \cos \phi \} \end{aligned}$$

(14)

It is worth mentioning that the parameter \(\Psi _{\phi \phi }(\vec {\rho })\) is not directly derived from equation (3); consequently, the modulation of \(X_{\phi \phi }\) has minimal influence on the synthesis, because the aperture field is always a quasi-TM mode40.
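Once \(M_x(\vec {\rho })\) and \(M_y(\vec {\rho })\) are fixed, equations (12)-(14) give the modulation indices and phases directly. A short sketch for the broadside, right-hand circularly polarized case follows; the frequency, the assumed \(\beta _{sw}\), and the uniform placeholder taper are illustrative choices rather than values taken from this work.

```python
import numpy as np

k = 2.0 * np.pi * 18e9 / 299792458.0   # free-space wavenumber at 18 GHz [1/m]
beta_sw = 1.28 * k                     # assumed surface-wave phase constant
theta0, phi0 = 0.0, 0.0                # broadside beam

rho = np.linspace(1e-3, 0.15, 300)
phi = np.linspace(0.0, 2.0 * np.pi, 181)
R, PHI = np.meshgrid(rho, phi, indexing="ij")

M_x = 0.1 * np.ones_like(R, dtype=complex)  # placeholder uniform taper
M_y = -1j * M_x                             # RHCP ratio from Eq. (11) at broadside

# Eq. (12): modulation indices.
m_rho = np.abs( M_x * np.cos(PHI) + M_y * np.sin(PHI))
m_phi = np.abs(-M_x * np.sin(PHI) + M_y * np.cos(PHI))

# Eqs. (13)-(14): modulation phases.
steer = k * R * np.sin(theta0) * np.cos(PHI - phi0)
psi_rr = beta_sw * R - steer + np.angle( M_x * np.cos(PHI) + M_y * np.sin(PHI))
psi_rp = beta_sw * R - steer + np.angle(-M_x * np.sin(PHI) + M_y * np.cos(PHI))
```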

In this study, we aim to precisely control the SLL and HPBW of the antenna radiation pattern by suitably selecting the distribution of \(M_x(\vec {\rho })\) and \(M_y(\vec {\rho })\) across the metasurface. Various functional forms have previously been used for these coefficients: a Gaussian distribution was employed in54 to achieve uniform field tapering for inward and outward modes, an exponential distribution was utilized in36 to optimize gain while maintaining a low SLL, and a cosine distribution was selected in45 to minimize SLLs. In this paper, we adopt a combination of exponential and cosine functions, as defined by the following expression, to control the aperture field magnitude:

$$\begin{aligned} M_x(\vec {\rho }) = M_0 \cos \left( \frac{\pi \rho }{2\gamma _1 \rho _{max}}\right) \times {\left\{ \begin{array}{ll} \gamma _2 \left( 1 - e^{-\frac{\gamma _3 \rho }{\rho _{max}}}\right) , & \rho < \gamma _4 \rho _{max} \\ m_{max} - \gamma _5 \left( 1 - e^{\gamma _6 \frac{\rho - \gamma _4 \rho _{max}}{\rho _{max}}}\right) , & \rho > \gamma _4 \rho _{max} \end{array}\right. } \end{aligned}$$

(15)

Here, \(\rho _{\text {max}}\) represents the antenna’s maximum spatial extent from the center, and \(m_{max} = \gamma _2 (1 - e^{-\gamma _3 \gamma _4})\).
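Equation (15) is simple to implement and sweep over \(\gamma\). A sketch is given below; the aperture radius \(\rho _{max}\) is an assumed value, while the \(\gamma\) vector and \(M_0\) correspond to the tapered design example discussed later in this section.

```python
import numpy as np

def taper_Mx(rho, gamma, rho_max, M0=1.0):
    """Aperture taper of Eq. (15); gamma = [g1, g2, g3, g4, g5, g6]."""
    g1, g2, g3, g4, g5, g6 = gamma
    m_max = g2 * (1.0 - np.exp(-g3 * g4))
    cos_env = np.cos(np.pi * rho / (2.0 * g1 * rho_max))
    inner = g2 * (1.0 - np.exp(-g3 * rho / rho_max))
    outer = m_max - g5 * (1.0 - np.exp(g6 * (rho - g4 * rho_max) / rho_max))
    return M0 * cos_env * np.where(rho < g4 * rho_max, inner, outer)

rho_max = 0.15                                    # assumed aperture radius [m]
rho = np.linspace(0.0, rho_max, 500)
Mx = taper_Mx(rho, [0.9, 1, 20, 0.1, 0.5, 1], rho_max, M0=0.1)
```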

The vector \(\gamma = [\gamma _1, \gamma _2, \gamma _3, \gamma _4, \gamma _5, \gamma _6]\) specified above defines the aperture field distribution, directly influencing the SLL and HPBW. In generating our dataset, we systematically vary the vector \(\gamma\). For simplicity, the parameter \(X_0\), representing the average reactance, is held constant. The primary goal of the Flat Optics method utilized here is to compute the surface wavenumber (\(k_{sw}\)) from the surface impedance parameters. Using the periodicity of the boundary condition, the surface current can be expanded in terms of adiabatic Floquet modes38:

$$\begin{aligned} \vec {J}(\vec {\rho }) = \sum _{n = -\infty }^{\infty }(J_\rho ^{(n)}(\vec {\rho })\hat{\rho } + J_\phi ^{(n)}(\vec {\rho })\hat{\phi }) e^{-jn (\beta _{sw}\rho - k \sin \theta _0 \cos (\phi - \phi _0)\rho )} H_1^{(2)}(k_{sw}\rho ) \end{aligned}$$

(16)

By applying the gradient operator to the phase of the above expansion, the n-th-order Floquet mode wavevector can be derived as:

$$\begin{aligned} \vec {k}^{(n)}(\vec {\rho }) = \vec {\beta }^{(n)}(\vec {\rho }) - j\vec {\alpha }^{(n)}(\vec {\rho }) = \nabla _{\vec {\rho }}[k_{sw}\rho + n\beta _{sw}\rho - n k \sin \theta _0 \cos (\phi - \phi _0)\rho ] \end{aligned}$$

(17)

The n-indexed phase and attenuation constants can be written as:

$$\begin{aligned} \begin{aligned} \vec {\beta }^{(n)}(\vec {\rho }) =&\Re \{\nabla _t(k_{sw}\rho )\} + n\beta _{sw}\hat{\rho } - n k \sin \theta _0 [\cos (\phi - \phi _0) \hat{\rho } - \sin (\phi - \phi _0)\hat{\phi }]\\ \vec {\alpha }^{(n)}(\vec {\rho }) =&\alpha _{sw}(\vec {\rho })\hat{\rho } = -\Im \{\nabla _t(k_{sw}\rho )\} \end{aligned} \end{aligned}$$

(18)

On the other hand, the IBC can be expanded in terms of higher-order Floquet modes as:

$$\begin{aligned} \vec {E}_t(\vec {\rho }) = \sum _n \vec {E}^{(n)}(\vec {\rho }) = \sum _n \underline{\underline{Z}}_s(\vec {\rho }) \cdot \vec {J}^{(n)}(\vec {\rho }) \end{aligned}$$

(19)

By substituting the Floquet expansion of the current distribution into the above equation and simplifying mathematically, we obtain:

$$\begin{aligned} \vec {E}^{(n)} = \underline{\underline{Z}}^{(0)} \cdot \vec {J}^{(n)} + \underline{\underline{Z}}^{(-1)} \cdot \vec {J}^{(n+1)} + \underline{\underline{Z}}^{(+1)} \cdot \vec {J}^{(n-1)} \end{aligned}$$

(20)

where the argument \((\vec {\rho })\) is omitted for brevity in the above equation. The tensors \(\underline{\underline{Z}}^{(0)}\) and \(\underline{\underline{Z}}^{(\pm 1)}\) are defined as follows:

$$\begin{aligned} \begin{aligned}&\underline{\underline{Z}}^{(0)} = j X_0 ( \hat{\rho }\hat{\rho } + \hat{\phi }\hat{\phi })\\&\underline{\underline{Z}}^{(\pm 1)} = \frac{jX_0}{2}[m_\rho (\vec {\rho }) e^{\mp j\Psi _{\rho \rho }(\vec {\rho })} \hat{\rho }\hat{\rho } - m_\rho (\vec {\rho }) e^{\mp j\Psi _{\phi \phi }(\vec {\rho })} \hat{\phi }\hat{\phi } + m_\phi (\vec {\rho }) e^{\mp j\Psi _{\rho \phi }(\vec {\rho })}(\hat{\rho }\hat{\phi } + \hat{\phi }\hat{\rho })] \end{aligned} \end{aligned}$$

(21)

Additionally, the n-th order electric field and surface current are related via Green’s function as follows:

$$\begin{aligned} {\vec {E}}^{(n)} = -j {\left[ \underline{\underline{X}}_0^{-1}(\vec {k}^{(n)}) + \underline{\underline{X}}_g^{-1}(\vec {k}^{(n)})\right] }^{-1} \cdot \vec {J}^{(n)} \end{aligned}$$

(22)

where:

$$\begin{aligned} \begin{aligned}&\underline{\underline{X}}_0(\vec {k}^{(n)}) = - \eta _0 \frac{\sqrt{\vec {k}^{(n)} \cdot \vec {k}^{(n)} - k^2}}{k} \hat{\rho }\hat{\rho } + \eta _0 \frac{k}{\sqrt{\vec {k}^{(n)} \cdot \vec {k}^{(n)} - k^2}} \hat{\phi } \hat{\phi }\\&\underline{\underline{X}}_g(\vec {k}^{(n)}) = \left[ \eta _0 \frac{\sqrt{-\vec {k}^{(n)} \cdot \vec {k}^{(n)} + \epsilon _r k^2}}{\epsilon _r k}\hat{\rho }\hat{\rho } + \eta _0 \frac{k}{\sqrt{-\vec {k}^{(n)} \cdot \vec {k}^{(n)} + \epsilon _r k^2}} \hat{\phi }\hat{\phi }\right] \tan (h \sqrt{\epsilon _r k^2 - \vec {k}^{(n)} \cdot \vec {k}^{(n)}}) \end{aligned} \end{aligned}$$

(23)

By combining the Green’s function with the boundary condition, we obtain the following recursive relation:

$$\begin{aligned} ([\underline{\underline{X}}_0^{-1}(\vec {k}^{(n)}) + \underline{\underline{X}}_g^{-1}(\vec {k}^{(n)})]^{-1} - j \underline{\underline{Z}}^{(0)}) \cdot \vec {J}^{(n)} - j \underline{\underline{Z}}^{(-1)} \cdot \vec {J}^{(n + 1)} - j \underline{\underline{Z}}^{(+1)} \cdot \vec {J}^{(n - 1)} = 0 \end{aligned}$$

(24)

The above equation consists of an infinite number of nonlinear equations, recursively relating the n-th mode to the previous and next modes. To solve it, a finite number of modes must be considered, neglecting the effect of higher-order modes. By assuming that most of the power is concentrated in the 0- and \(\pm 1\)-indexed modes, limiting the analysis to these modes provides the desired accuracy51. The equation for the zeroth-order (0-indexed) Floquet mode can then be derived as:

$$\begin{aligned} (\underline{\underline{X}}_s^{(0)} + \underline{\underline{Z}}^{(-1)} \cdot [\underline{\underline{X}}_s^{(+1)}]^{-1} \cdot \underline{\underline{Z}}^{(+1)} + \underline{\underline{Z}}^{(+1)} \cdot [\underline{\underline{X}}_s^{(-1)}]^{-1} \cdot \underline{\underline{Z}}^{(-1)} ) \cdot \vec {J}^{(0)} = 0 \end{aligned}$$

(25)

where:

$$\begin{aligned} \underline{\underline{X}}_s^{(n)} = [\underline{\underline{X}}_0^{-1}(\vec {k}^{(n)}) + \underline{\underline{X}}_g^{-1}(\vec {k}^{(n)})]^{-1} - j \underline{\underline{Z}}^{(0)} \end{aligned}$$

(26)

Since the zeroth-order Floquet mode always possesses a nonzero amplitude (even when modulation indices approach zero), a necessary condition for satisfying the equality is that the determinant of the coefficient tensor must be set to zero. This condition facilitates the derivation of the propagation constants.
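The determinant condition can be solved numerically for the complex \(k_{sw}\). The sketch below assembles the tensors of (21), (23), and (26) for a broadside design with a uniform modulation index and searches for a root of the determinant of the coefficient tensor in (25); the substrate parameters, the square-root branches, and the initial guess are assumptions made for illustration and are not taken from this work.

```python
import numpy as np
from scipy.optimize import root

f, c0, eta0 = 18e9, 299792458.0, 376.73
k = 2.0 * np.pi * f / c0
eps_r, h = 3.0, 1.5e-3        # assumed substrate permittivity and thickness [m]
X0 = 0.8 * eta0               # average reactance, as in the design examples
M0 = 0.1                      # uniform modulation index (m_phi set to zero here)

I2 = np.eye(2, dtype=complex)

def X_free(kn):
    """Free-space reactance tensor of Eq. (23): TM (rho-rho) and TE (phi-phi) parts."""
    kz = np.sqrt(kn**2 - k**2 + 0j)
    return np.diag([-eta0 * kz / k, eta0 * k / kz])

def X_ground(kn):
    """Grounded-slab reactance tensor of Eq. (23)."""
    kzd = np.sqrt(eps_r * k**2 - kn**2 + 0j)
    return np.diag([eta0 * kzd / (eps_r * k), eta0 * k / kzd]) * np.tan(h * kzd)

Z0  = 1j * X0 * I2                         # Eq. (21), n = 0 tensor
Zp1 = 0.5j * X0 * np.diag([M0, -M0 + 0j])  # Eq. (21); modulation phases are dropped
Zm1 = Zp1.copy()                           # since they cancel in Eq. (25) when m_phi = 0

def Xs(kn):
    """Eq. (26): parallel combination of the Green's-function reactances minus j Z^(0)."""
    par = np.linalg.inv(np.linalg.inv(X_free(kn)) + np.linalg.inv(X_ground(kn)))
    return par - 1j * Z0

def det_condition(x):
    ksw = x[0] + 1j * x[1]
    bsw = x[0]               # broadside: k^(n) = k_sw + n * beta_sw, directed along rho
    D = (Xs(ksw)
         + Zm1 @ np.linalg.inv(Xs(ksw + bsw)) @ Zp1
         + Zp1 @ np.linalg.inv(Xs(ksw - bsw)) @ Zm1)
    d = np.linalg.det(D)
    return [d.real, d.imag]

sol = root(det_condition, x0=[1.3 * k, -0.02 * k])  # slow, weakly leaky initial guess
print("beta_sw/k =", sol.x[0] / k, "  alpha_sw/k =", -sol.x[1] / k)
```

Sweeping \(M_0\) and \(X_0\) in such a root search is what produces dispersion maps of the kind shown in Fig. 2.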

Notably, the average reactance \(X_0\) and the distribution of the modulation index substantially influence the control of the phase and leakage constants of the fundamental (0-indexed) mode. To examine the effects of these parameters on propagation characteristics, we consider a uniform modulation coefficient distribution \(M_x(\vec {\rho }) = M_0\) and illustrate the variations of the phase constant \(\beta _{sw}\) and the leakage constant \(\alpha _{sw}\) with respect to changes in \(M_0\) and \(X_0\), as depicted in Fig. 2. Here, \(\beta _{sw}\) and \(\alpha _{sw}\) denote the real and imaginary components of the eigenvalue obtained from the eigenmode equation described in (25).

As shown in Fig. 2a, the phase constant of the fundamental mode consistently exceeds \(k\), confirming that the surface mode is invariably slow-wave, with its dispersion characteristic situated below the light line. The variation of the leakage constant \(\alpha _{sw}\) is presented in Fig. 2b. For \(X_0 < 280 j\Omega\), it is evident that an increase in \(M_0\) correlates with an elevated leakage factor. Consequently, precise local adjustments of the modulation coefficient allow for tailored tuning of the leakage factor, enabling precise control of the radiation pattern amplitude.

Fig. 2

The variation maps of (a) \(\beta _{sw}\) and (b) \(\alpha _{sw}\) versus modulation index \(M_0\) and average reactance \(X_0\). The maps are obtained by solving the eigenmode equation in (25).

Additionally, the aperture field for the radiating mode can be calculated as follows:

$$\begin{aligned} {\vec {E}}^{(-1)} = j [\underline{\underline{Z}}^{(0)} \cdot [\underline{\underline{X}}_s^{(-1)}]^{-1} \cdot \underline{\underline{Z}}^{(-1)} - j \underline{\underline{Z}}^{(-1)}] \cdot \vec {J}^{(0)} \end{aligned}$$

(27)

Fig. 3

Synthesized surface impedance and analytical result of the MMA with uniform impedance distribution: (a) \(X_{\rho \rho }\), (b) \(X_{\rho \phi }\), and (c) normalized radiation pattern in the \(\phi = 0^\circ\) plane.

Fig. 4

Synthesized surface impedance and analytical result of the MMA with tapered impedance distribution: (a) \(X_{\rho \rho }\), (b) \(X_{\rho \phi }\), and (c) normalized radiation pattern in the \(\phi = 0^\circ\) plane.

To examine the effect of the modulation index on the SLL and HPBW, we present two illustrative examples. The operating frequency is set at 18 GHz, and the radiated wave polarization is considered right-hand circular. In both cases, the modulation coefficient \(M_0\) is fixed at 0.1, and the average reactance \(X_0\) is selected as \(0.8 \eta _0\).
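As a rough consistency check (using the classical TM surface-wave relation for an ideal reactance boundary, \(\beta /k = \sqrt{1 + (X_0/\eta _0)^2}\), which neglects the modulation and substrate dispersion), the choice \(X_0 = 0.8\eta _0\) corresponds to \(\beta _{sw}/k \approx 1.28\), consistent with the slow-wave behavior observed in Fig. 2a.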

In the first example, a uniform modulation distribution is assumed, such that \(M_x(\vec {\rho }) = M_0\). Under this condition, the aperture field magnitude decays exponentially with a leakage factor of \(\alpha _{sw}\). This decay results in a 5 dB reduction in sidelobe level compared to a case with a uniform field distribution. Fig. 3 illustrates the surface impedance distribution and the normalized radiation pattern in the \(\phi = 0^\circ\) plane for the antenna without modulation tapering. The corresponding half-power beamwidth in this configuration is approximately 4 degrees.

In the second example, we analyze an antenna with a tapered modulation profile defined by the parameter vector \(\gamma = [0.9, 1, 20, 0.1, 0.5, 1]\). Fig. 4 displays the surface impedance distribution and the resulting radiation pattern. In this case, the sidelobe level is reduced to −24.6 dB. This improvement is attributed to the smoother impedance variation at the edges of the aperture, which reduces wave diffraction and contributes to a more desirable radiation pattern.

The six parameters defining the impedance distribution play a determinative role in shaping both the sidelobe level and the beamwidth. However, discerning their relative significance through isolated case studies remains nontrivial. In the following section, we demonstrate that a data-driven neural network, augmented with feature attribution analysis, enables accurate prediction of both metrics while providing quantitative insights into parameter influence.
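To make that workflow concrete, a minimal sketch of such a surrogate model is given below using scikit-learn and the shap library; the network size, the sampling ranges of \(\gamma\), and the randomly generated placeholder targets are assumptions for illustration and do not reproduce the architecture or dataset used in this work.

```python
import numpy as np
import shap
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
# Placeholder dataset: gamma_1..gamma_6 sampled in assumed ranges; the targets
# [SLL, HPBW] are random stand-ins for the analytically computed values.
X = rng.uniform([0.5, 0.5, 5.0, 0.05, 0.1, 0.5],
                [1.0, 1.5, 30.0, 0.30, 1.0, 2.0], size=(2000, 6))
y = rng.normal(size=(2000, 2))

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
mlp = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0)
mlp.fit(X_tr, y_tr)

# Model-agnostic SHAP attribution on a small background/evaluation subset,
# quantifying how each gamma_i drives the predicted SLL and HPBW.
explainer = shap.KernelExplainer(mlp.predict, shap.sample(X_tr, 100))
shap_values = explainer.shap_values(X_te[:50])
```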




‘No honour among thieves’: M&S hacking group starts turf war




A clash between rival criminal ransomware groups could result in corporate victims being extorted twice, cyber experts warn




Insurance Industry Rejects Proposed Moratorium on State Artificial Intelligence Regulation



By Chad Hemenway

A proposed decade-long moratorium on state regulation of artificial intelligence has gained the attention of many, including those within the insurance industry.

The 10-year prohibition of AI regulation is contained within the sweeping tax bill, “One Big Beautiful Bill,” and would preempt laws and regulations already in place in dozens of states.

The National Association of Professional Insurance Agents (PIA) on June 16 sent a letter “expressing significant concern” to Senate leadership, which is now considering the reconciliation budget bill that has already passed the House of Representatives.

“PIA strongly urges the Senate to eliminate the reconciliation language enforcing a 10-year moratorium on state AI legislation and regulation, or explicitly exempt the insurance industry’s state regulation of AI because the industry is already appropriately regulated by the state,” said the letter, signed by Mike Skiados, CEO of PIA.

PIA referenced a model already adopted by the National Association of Insurance Commissioners (NAIC) that requires insurers to implement AI governance programs in accordance with all existing state and federal laws. Nearly 30 states have adopted the NAIC’s model on the use of AI by insurers.

Earlier in June, NAIC sent a letter to federal lawmakers following the passage of the bill in the House. The commissioners said state regulation has been effective in evolving market conditions.

“This system has not only protected consumers and fostered innovation but has also allowed for the flexibility and experimentation that is essential in a rapidly changing world,” said NAIC leadership in the letter. “By allowing states to develop and implement appropriately tailored regulatory frameworks, the system ensures that oversight is both robust and adaptable.”

“State insurance regulators understand that AI is a transformative technology that can be leveraged to benefit insurance policyholders by, among other things, creating new product offerings, improving the efficiency of the insurance business, and transforming the consumer experience.”

The language, and more specifically the definition of AI within the bill, is also of concern. NAIC called it “overly broad” and questioned whether it applies not only to machine learning but also to “existing analytical tools and software that insurers rely on every day, including calculations, simulations, and stochastic forecasts…and a multitude of insurtech provided analytical systems for rate setting, underwriting, and claims processing.”

To that end, the American InsurTech Council (AITC) said it “strongly opposes” the AI state regulation moratorium, which it said would “create a dangerous vacuum in oversight during a period of rapid technological change.”

“Such a ban would undermine the foundational principles of insurance regulation in the United States and jeopardize consumer protections at a time when AI is rapidly transforming the way insurance is developed, priced, marketed, underwritten, and delivered,” said the AITC in a statement.

In May, state attorneys general in 40 states urged Congress to get rid of the moratorium proposal within the bill.

On June 16, the National Council of Insurance Legislators (NCOIL) in a statement said a ban on state regulation would “disrupt the overall markets that we oversee” and “wrongly curtail” state legislators’ ability to make policy.

The group said constituents have “been steadfast in asking for protections against the current unknowns surrounding AI, and they cannot wait 10 years for a state-based policy response.”





Why it is vital that you understand the infrastructure behind AI



As demand increases for AI solutions, the competition around the huge infrastructure required to run AI models is becoming ever more fierce. This affects the entire AI chain, from computing and storage capacity in data centres, through processing power in chips, to consideration of the energy needed to run and cool equipment.

When implementing an AI strategy, companies have to look at all these aspects to find the best fit for their needs. This is harder than it sounds. A business’s decision on how to deploy AI is very different to choosing a static technology stack to be rolled out across an entire organisation in an identical way. 

Businesses have yet to understand that a successful AI strategy is “no longer a tech decision made in a tech department about hardware”, says Mackenzie Howe, co-founder of Atheni, an AI strategy consultant. As a result, she says, nearly three-quarters of AI rollouts do not give any return on investment.

Department heads unaccustomed to making tech decisions will have to learn to understand technology. “They are used to being told ‘Here’s your stack’,” Howe says, but leaders now have to be more involved. They must know enough to make informed decisions. 

While most businesses still formulate their strategies centrally, decisions on the specifics of AI have to be devolved as each department will have different needs and priorities. For instance legal teams will emphasise security and compliance but this may not be the main consideration for the marketing department. 

“If they want to leverage AI properly — which means going after best-in-class tools and much more tailored approaches — best in class for one function looks like a different best in class for a different function,” Howe says. Not only will the choice of AI application differ between departments and teams, but so might the hardware solution.

One phrase you might hear as you delve into artificial intelligence is “AI compute”. This is a term for all the computational resources required for an AI system to perform its tasks. The AI compute required in a particular setting will depend on the complexity of the system and the amount of data being handled.

The decision flow: what are you trying to solve?

Although this report will focus on AI hardware decisions, companies should bear in mind the first rule of investing in a technology: identify the problem you need to solve first. Avoiding AI is no longer an option but simply adopting it because it is there will not transform a business. 

Matt Dietz, the AI and security leader at Cisco, says his first question to clients is: what process and challenge are you trying to solve? “Instead of trying to implement AI for the sake of implementing AI . . . is there something that you are trying to drive efficiency in by using AI?” he says.

Companies must understand where AI will add the most value, Dietz says, whether that is enhancing customer interactions or making these feasible 24/7. Is the purpose to give staff access to AI co-pilots to simplify their jobs or is it to ensure consistent adherence to rules on compliance?

“When you identify an operational challenge you are trying to solve, it is easier to attach a return on investment to implementing AI,” Dietz says. This is particularly important if you are trying to bring leadership on board and the initial investment seems high.

Companies must address further considerations. Understanding how much “AI compute” is required — in the initial phases as well as how demand might grow — will help with decisions on how and where to invest. “An individual leveraging a chatbot doesn’t have much of a network performance effect. An entire department leveraging the chatbot actually does,” Dietz says. 

Infrastructure is therefore key: specifically having the right infrastructure for the problem you are trying to solve. “You can have an unbelievably intelligent AI model that does some really amazing things, but if the hardware and the infrastructure is not set up to support that then you are setting yourself up for failure,” Dietz says. 

He stresses that flexibility around providers, fungible hardware and capacity is important. Companies should “scale as the need grows” once the model and its efficiencies are proven.

The data server dilemma: which path to take?

When it comes to data servers and their locations, companies can choose between owning infrastructure on site, or leasing or owning it off site. Scale, flexibility and security are all considerations. 

While on-premises data centres are more secure they can be costly both to set up and run, and not all data centres are optimised for AI. The technology must be scalable, with high-speed storage and low latency networking. The energy to run and cool the hardware should be as inexpensive as possible and ideally sourced from renewables, given the huge demand.

Space-constrained enterprises with distinct requirements tend to lease capacity from a co-location provider, whose data centre hosts servers belonging to different users. Customers either install their own servers or lease a “bare metal”, a type of (dedicated) server, from the co-location centre. This option gives a company more control over performance and security and it is ideal for businesses that need custom AI hardware, for instance clusters of high-density graphics processing units (GPUs) as used in model training, deep learning or simulations. 

Another possibility is to use prefabricated and pre-engineered modules, or modular data centres. These suit companies with remote facilities that need data stored close at hand or that otherwise do not have access to the resources for mainstream connection. This route can reduce latency and reliance on costly data transfers to centralised locations. 

Given factors such as scalability and speed of deployment as well as the ability to equip new modules with the latest technology, modular data centres are increasingly relied upon by the cloud hyperscalers, such as Microsoft, Google and Amazon, to enable faster expansion. The modular market was valued at $30bn in 2024 and its value is expected to reach $81bn by 2031, according to a 2025 report by The Insight Partners.

Modular data centres are only a segment of the larger market. Estimates for the value of data centres worldwide in 2025 range from $270bn to $386bn, with projections for compound annual growth rates of 10 per cent into the early 2030s when the market is projected to be worth more than $1tn. 

Much of the demand is driven by the growth of AI and its higher resource requirements. McKinsey predicts that the demand for data centre capacity could more than triple by 2030, with AI accounting for 70 per cent of that.

While the US has the most data centres, other countries are fast building their own. Cooler climates and plentiful renewable energy, as in Canada and northern Europe, can confer an advantage, but countries in the Middle East and south-east Asia increasingly see having data centres close by as a geopolitical necessity. Access to funding and research can also be a factor. Scotland is the latest emerging European data centre hub.

Chart showing consumption of power by data centres

Choose the cloud . . . 

Companies that cannot afford or do not wish to invest in their own hardware can opt to use cloud services, which can be scaled more easily. These provide access to any part or all of the components necessary to deploy AI, from GPU clusters that execute vast numbers of calculations simultaneously, through to storage and networking. 

While the hyperscalers grab the headlines because of their investments and size — they have some 40 per cent of the market — they are not the only option. Niche cloud operators can provide tailored solutions for AI workloads: CoreWeave and Lambda, for instance, specialise in AI and GPU cloud computing.

Companies may prefer smaller providers for a first foray into AI, not least because they can be easier to navigate while offering room to grow. Digital Ocean boasts of its simplicity while being optimised for developers; Kamatera offers cloud services run out of its own data centres in the US, Emea and Asia, with proximity to customers minimising latency; OVHcloud is strong in Europe, offering cloud and co-location services with an option for customers to be hosted exclusively in the EU. 

Many of the smaller cloud companies do not have their own data centres and lease the infrastructure from larger groups. In effect this means that a customer is leasing from a leaser, which is worth bearing in mind in a world fighting for capacity. That said, such businesses may also be able to switch to newer data centre facilities. These could have the advantage of being built primarily for AI and designed to accommodate the technology’s greater compute load and energy requirements. 

. . . or plump for a hybrid solution

Another solution is to have a blend of proprietary equipment with cloud or virtual off-site services. These can be hosted by the same data centre provider, many of which offer ready-made hybrid services with hyperscalers or the option to mix and match different network and cloud providers. 

For instance Equinix supports Amazon Web Services with a connection between on-premises networks and cloud services through AWS Direct Connect; the Equinix Fabric ecosystem provides a choice between cloud, networking, infrastructure and application providers; Digital Realty can connect clients to 500 cloud service providers, meaning its customers are not limited to using large players. 

There are different approaches that apply to the hybrid route, too. Each has its advantages:

  • Co-location with cloud hybrid. This can offer better connectivity between proprietary and third-party facilities with direct access to some larger cloud operators. 

  • On premises with cloud hybrid. This solution gives the owner more control with increased security, customisation options and compliance. If a company already has on-premises equipment it may be easier to integrate cloud services over time. Drawbacks can include latency problems or compatibility and network constraints when integrating cloud services. There is also the prohibitive cost of running a data centre in house.

  • Off-site servers with cloud hybrid. This is a simple option for those who seek customisation and scale. With servers managed by the data centre provider, it requires less customer input but this comes with less control, including over security. 

In all cases whenever a customer relies on a third party to handle some server needs, it gives them the advantage of being able to access innovations in data centre operations without a huge investment. 

Arti Garg, the chief technologist at Aveva, points to the huge innovation happening in data centres. “It’s significant and it is everything from power to cooling to early fault detection [and] error handling,” she says.

Garg adds that a hybrid approach is especially helpful for facilities with limited compute capacity that rely on AI for critical operations, such as power generation. “They need to think how AI might be leveraged in fault detection [so] that if they lose connectivity to the cloud they can still continue with operations,” she says. 

Using modular data centres is one way to achieve this. Aggregating data in the cloud also gives operators a “fleet-level view” of operations across sites or to provide backup. 

In an uncertain world, sovereignty is important

Another consideration when assessing data centre options is the need to comply with a home country’s rules on data. “Data sovereignty” can dictate the jurisdiction in which data is stored as well as how it is accessed and secured. Companies might be bound to use facilities located only in countries that comply with those laws, a condition sometimes referred to as data residency compliance. 

Having data centre servers closer to users is increasingly important. With technology borders springing up between China and the US, many industries must look at where their servers are based for regulatory, security and geopolitical reasons.

In addition to sovereignty, Garg of Aveva says: “There is also the question of tenancy of the data. Does it reside in a tenant that a customer controls [or] do we host data for the customer?” With AI and the regulations surrounding it changing so rapidly such questions are common.

Edge computing can bring extra resilience

One way to get around this is by computing “at the edge”. This places computing centres closer to the data source, so improving processing speeds. 

Edge computing not only reduces bandwidth-heavy data transmission, it also cuts latency, allowing for faster responses and real-time decision-making. This is essential for autonomous vehicles, industrial automation and AI-powered surveillance. Decentralisation spreads computing over many points, which will help in the event of an outage. 

As with modular data centres, edge computing is useful for operators who need greater resilience, for instance those with remote facilities in adverse conditions such as oil rigs. Garg says: “More advanced AI techniques have the ability to support people in these jobs . . . if the operation only has a cell or a tablet and we want to ensure that any solution is resilient to loss of connectivity . . . what is the solution that can run in power and compute-constrained environments?” 

Some of the resilience of edge computing comes from exploring smaller or more efficient models and using technologies deployed in the mobile phones sector.

While such operations might demand edge computing out of necessity, it is a complementary approach to cloud computing rather than a replacement. Cloud is better suited for larger AI compute burdens such as model training, deep learning and big data analytics. It provides high computational power, scalability and centralised data storage. 

Given the limitations of edge in terms of capacity — but its advantages in speed and access — most companies will probably find that a hybrid approach works best for them.

Chips with everything, CPUs, GPUs, TPUs: an explainer 

Chips for AI applications are developing rapidly. The examples below give a flavour of those being deployed, from training to operation. Different chips excel in different parts of the chain although the lines are blurring as companies offer more efficient options tailored to specific tasks. 

GPUs, or graphics processing units, offer the parallel processing power required for AI model training, best applied to complex computations of the sort required for deep learning. 

Nvidia, whose chips are designed for gaming graphics, is the market leader but others have invested heavily to try to catch up. Dietz of Cisco says: “The market is rapidly evolving. We are seeing growing diversity among GPU providers contributing to the AI ecosystem — and that’s a good thing. Competition always breeds innovation.”

AWS uses high-performance GPU clusters based on chips from Nvidia and AMD but it also runs its own AI-specific accelerators. Trainium, optimised for model training, and Inferentia, used by trained models to make predictions, have been designed by AWS subsidiary Annapurna. Microsoft Azure has also developed corresponding chips, including the Azure Maia 100 for training and an Arm-based CPU for cloud operations. 

CPUs, or central processing units, are the chips once used more commonly in personal computers. In the AI context, they do lighter or localised execution tasks such as operations in edge devices or in the inference phase of the AI process. 

Nvidia, AWS and Intel all have custom CPUs designed for networking and all major tech players have produced some form of chip to compete in edge devices. Google’s Edge TPU, Nvidia’s Jetson and Intel’s Movidius all boost AI model performance in compact devices. CPUs such as Azure’s Cobalt CPU can also be optimised for cloud-based AI workloads with faster processing, lower latency and better scalability. 

Bar chart of Forecast total capital expenditure on chips for “frontier AI” ($bn) showing Inference spending set to increase

Many CPUs use design elements from Arm, the British chip designer bought by SoftBank in 2016, on whose designs nearly all mobile devices rely. Arm says its compute platform “delivers unmatched performance, scalability, and efficiency”.

TPUs, or tensor processing units, are a further specification. Designed by Google in 2015 to accelerate the inference phase, these chips are optimised for high-speed parallel processing, making them more efficient for large-scale workloads than GPUs. While not necessarily the same architecture, competing AI-dedicated designs include AI accelerators such as AWS’s Trainium.

Breakthroughs are constantly occurring as researchers try to improve efficiency and speed and reduce energy usage. Neuromorphic chips, which mimic brain-like computations, can run operations in edge devices with lower power requirements. Stanford University in California, as well as companies including Intel, IBM and Innatera, have developed versions each with different advantages. Researchers at Princeton University in New Jersey are also working on a low-power AI chip based on a different approach to computation.

High-bandwidth memory helps but it is not a perfect solution

Memory capacity plays a critical role in AI operation and is struggling to keep up with the broader infrastructure, giving rise to the so-called memory wall problem. According to techedgeai.com, in the past two years AI compute power has grown by 750 per cent and speeds have increased threefold, while dynamic random-access memory (Dram) bandwidth has grown by only 1.6 times. 

AI systems require massive memory resources, ranging from hundreds of gigabytes to terabytes and above. Memory is particularly significant in the training phase for large models, which demand high-capacity memory to process and store data sets while simultaneously adjusting parameters and running computations. Local memory efficiency is also crucial for AI inference, where rapid access to data is necessary for real-time decision-making.

High bandwidth memory is helping to alleviate this bottleneck. While built on evolved Dram technology, high bandwidth memory introduces architectural advances. It can be packaged into the same chipset as the core GPU to provide lower latency and it is stacked more densely than Dram, reducing data travel time and improving latency. It is not a perfect solution, however, as stacking can create more heat, among other constraints.

Everyone needs to consider compatibility and flexibility

Although models continue to develop and proliferate, the good news is that “the ability to interchange between models is pretty simple as long as you have the GPU power — and some don’t even require GPUs, they can run off CPUs,” Dietz says. 

Hardware compatibility does not commit users to any given model. Having said that, change can be harder for companies tied to chips developed by service providers. Keeping your options open can minimise the risk of being “locked in”.

This can be a problem with the more dominant players. The UK regulator Ofcom referred the UK cloud market to the Competition and Markets Authority because of the dominance of three of the hyperscalers and the difficulty of switching providers. Ofcom’s objections included high fees for transferring data out, technical barriers to portability and committed spend discounts, which reduced costs but tied users to one cloud provider. 

Placing business with various suppliers offsets the risk of any one supplier having technical or capacity constraints but this can create side-effects. Problems may include incompatibility between providers, latency when transferring and synchronising data, security risk and costs. Companies need to consider these and mitigate the risks. Whichever route is taken, any company planning to use AI should make portability of data and service a primary consideration in planning. 

Flexibility is critical internally, too, given how quickly AI tools and services are evolving. Howe of Atheni says: “A lot of what we’re seeing is that companies’ internal processes aren’t designed for this kind of pace of change. Their budgeting, their governance, their risk management . . . it’s all built for that very much more stable, predictable kind of technology investment, not rapidly evolving AI capabilities.”

This presents a particular problem for companies with complex or glacial procurement procedures: months-long approval processes hamper the ability to utilise the latest technology. 

Garg says: “The agility needs to be in the openness to AI developments, keeping abreast of what’s happening and then at the same time making informed — as best you can — decisions around when to adopt something, when to be a little bit more mindful, when to seek advice and who to seek advice from.”

Industry challenges: trying to keep pace with demand

While individual companies might have modest demands, one issue for industry as a whole is that the current demand for AI compute and the corresponding infrastructure is huge. Off-site data centres will require massive investment to keep pace with demand. If this falls behind, companies without their own capacity could be left fighting for access. 

McKinsey says that, by 2030, data centres will need $6.7tn more capital to keep pace with demand, with those equipped to provide AI processing needing $5.2tn, although this assumes no further breakthroughs and no tail-off in demand. 

The seemingly insatiable demand for capacity has led to an arms race between the major players. This has further increased their dominance and given the impression that only the hyperscalers have the capital to provide flexibility on scale.

Column chart of Data centre capex (rebased, 2024 = 100) showing Capex is set to more than double by the end of the decade

Sustainability: how to get the most from the power supply

Power is a serious problem for AI operations. In April 2025 the International Energy Agency released a report dedicated to the sector. The IEA believes that grid constraints could delay one-fifth of the data centre capacity planned to be built by 2030. Amazon and Microsoft cited power infrastructure or inflated lease prices as the cause for recent withdrawals from planned expansion. They rejected reports of overcapacity.

Not only do data centres require considerable energy for computation, they draw a huge amount of energy to run and cool equipment. The power requirements of AI data centres are 10 times those of a standard technology rack, according to Soben, the global construction consultancy that is now part of Accenture. 

This demand is pushing data centre operators to come up with their own solutions for power while they wait for the infrastructure to catch up. In the short term some operators are looking at “power skids” to increase the voltage drawn off a local network. Others are planning long-term and considering installing their own small modular reactors, as used in nuclear submarines and aircraft carriers.

Another approach is to reduce demand by making cooling systems more efficient. Newer centres have turned to liquid cooling: not only do liquids have better thermal conductivity than air, the systems can be enhanced with more efficient fluids. Algorithms preemptively adjust the circulation of liquid through cold plates attached to processors (direct-to-chip cooling). Reuse of waste water makes such solutions seem green, although data centres continue to face objections in locations such as Virginia as they compete for scarce water resources.

The DeepSeek effect: smaller might be better for some

While companies continue to throw large amounts of money at capacity, the development of DeepSeek in China has raised questions such as “do we need as much compute if DeepSeek can achieve it with so much less?”. 

The Chinese model is cheaper to develop and run for businesses. It was developed despite import restrictions on top-end chips from the US to China. DeepSeek is free to use and open source — and it is also able to verify its own thinking, which makes it far more powerful as a “reasoning model” than assistants that pump out unverified answers.

Now that DeepSeek has shown the power and efficiency of smaller models, this should add impetus to a rethink around capacity. Not all operations need the largest model available to achieve their goals: smaller models that are less greedy for compute and power can be more efficient at a given job.

Dietz says: “A lot of businesses were really cautious about adopting AI because . . . before [DeepSeek] came out, the perception was that AI was for those that had the financial means and infrastructure means.”

DeepSeek showed that users could leverage different capabilities and fine-tune models and still get “the same, if not better, results”, making it far more accessible to those without access to vast amounts of energy and compute.

Definitions

Training: teaching a model how to perform a given task.

The inference phase: the process by which an AI model can draw conclusions from new data based on the information used in its training

Latency: the time delay between an AI model receiving an input and generating an output.

Edge computing: processing on a local device. This reduces latency so is essential for systems that require a real-time response, such as autonomous cars, but it cannot deal with high-volume data processing.

Hyperscalers: providers of huge data centre capacity such as Amazon’s AWS, Microsoft’s Azure, Google Cloud and Oracle Cloud. They offer off-site cloud services with everything from compute power and pre-built AI models through to storage and networking, either all together or on a modular basis. 

AI compute: the hardware resources that run AI applications, algorithms and workloads, typically involving servers, CPUs, GPUs or other specialised chips. 

Co-location: the use of data centres which rent space where businesses can keep their servers.

Data residency: the location where data is physically stored on a server.

Data sovereignty: the concept that data is subject to the laws and regulations of the land where it was gathered. Many countries have rules about how data is gathered, controlled, stored and accessed. Where the data resides is increasingly a factor if a country feels that its security or use might be at risk.


