E‐CHAIM as a Model of Total Electron Content: Performance and Diagnostics

Here, we assess to what extent the Empirical Canadian High Arctic Ionospheric Model (E‐CHAIM) can reproduce the climatological variations of vertical Total Electron Content (vTEC) in the Canadian sector. Within the auroral oval and polar cap, E‐CHAIM is found to exhibit Root Mean Square (RMS) errors in vTEC as low as 0.4 TECU during solar minimum summer but as high as 5.0 TECU during solar maximum equinox conditions. These errors represent an improvement of up to 8.5 TECU over the errors of the International Reference Ionosphere (IRI) in the same region. At sub‐auroral latitudes, E‐CHAIM RMS errors range between 1.0 and 7.4 TECU, with greatest errors during the equinoxes at high solar activity. This represents an up to 0.5 TECU improvement over the IRI during summer but worse performance by up to 2.4 TECU during the winter. Comparisons of E‐CHAIM performance against in situ measurements from the European Space Agency's Swarm mission are also conducted, ultimately finding behavior consistent with that of vTEC. In contrast to the vTEC results, however, E‐CHAIM and the IRI exhibit comparable performance at Swarm altitudes, except within the polar cap, where the IRI exhibits systematic underestimation of electron density by up to 1.0 × 10¹¹ e/m³. Conjunctions with mid‐latitude ionosondes demonstrate that E‐CHAIM's errors appear to result from compounding same‐signed errors in its NmF2, hmF2, and topside thickness at these latitudes. Overall, E‐CHAIM exhibits strong performance within the polar cap and auroral oval but performs comparably to the IRI at sub‐auroral latitudes.

compared to the IRI while demonstrating more modest improvements over the IRI at CHAMP. These more modest improvements were, in that study, attributed to increased IRI performance in the near-peak topside due to an underestimation in IRI modeled ionospheric peak density (NmF2) acting against an overestimation in the thickness of the near-peak topside, rather than related to a particular loss in E-CHAIM performance. Maltseva and Nikitenko (2019) examined the performance of E-CHAIM in the representation of NmF2 in the Russian sector during a number of storms, where E-CHAIM performance appeared comparable to that of a GNSS TEC assimilation approach in reproducing nighttime enhancements in electron density associated with increased geomagnetic activity and capturing negative ionospheric storm responses. The recent Themens et al. (2020) study further diagnosed the performance of E-CHAIM in the representation of the short time scale variability in NmF2, concluding that E-CHAIM is able to capture ∼25% of the variability of the ionosphere at sub-monthly time scales, mainly through its representation of negative ionospheric storm responses. In order to get a better idea of the model's performance as a whole and to identify potential shortcomings in this performance, further validation with respect to observations not incorporated into the model during development must be undertaken.
For this purpose, in this study, we primarily evaluate the performance of E-CHAIM (version 2.0.0) with respect to ground-based Global Positioning System (GPS) measurements of Total Electron Content (TEC), the column integrated electron density of the ionosphere between the ground and the GPS orbit. We further pursue insight into the nature of this performance using Swarm Langmuir Probe (LP)-derived in situ electron density measurements (Knudsen et al., 2017; Lomidze et al., 2018) and Swarm conjunctions with mid-latitude ionosondes in the North American sector.
We begin this study by reproducing a previous validation of the IRI in the Canadian Arctic, namely Themens and Jayachandran (2016), to conduct a comprehensive comparison between IRI and E-CHAIM performance in this region. To this end, we make use of the same GPS TEC dataset that was used in that study and will employ many of the same comparison techniques to ensure that the E-CHAIM results presented here can be directly compared to the IRI results of Themens and Jayachandran (2016). This validation effort is conducted in Section 3 with a following discussion of the potential sources of model error in Section 5.
To further diagnose the behavior of E-CHAIM TEC performance, we conduct a subsequent brief validation of E-CHAIM with respect to Swarm LP measurements in Section 4. The Swarm satellite altitudes of ∼450 and ∼500 km provide unique insight into the performance of the model in the near-peak topside, a region that is extremely sensitive to the interactions between the various sub-models that make up E-CHAIM's topside electron density (i.e., hmF2, NmF2, and HTop). This is highlighted in previous attempts in Bilitza et al. (2012) to diagnose IRI deficiencies at these altitudes that were identified using GRACE and CHAMP in situ satellite measurements in Lühr and Xiong (2010). To help contextualize the performance level of E-CHAIM in comparison to Swarm, IRI comparisons will also be provided here; however, previous studies have examined the performance of the IRI using Swarm data (Lomidze et al., 2018). Discussion regarding the combined TEC/Swarm validations and their implications is undertaken in Section 5. Prior to conducting the aforementioned validations, the data used in this study are described in detail in Section 2.

Data
In this study we assess the performance of E-CHAIM as a TEC model using data from CHAIN GNSS receivers and diagnose the nature of E-CHAIM TEC errors in the near-peak topside region using measurements from the ESA Swarm constellation. These results are subsequently contextualized through comparisons to the IRI, with further diagnosis of their origin conducted with the assistance of ionosonde conjunctions with Swarm passes. In this section we first provide an overview of the relevant components of E-CHAIM before introducing the CHAIN and Swarm datasets used in this study. Following this, a brief overview of the IRI and the ionosonde measurements used in this study is provided.

The E-CHAIM Formulation
E-CHAIM's representations of the peak ionospheric density (NmF2) and peak height (hmF2) were first proposed in Themens, Jayachandran, Galkin, and Hall (2017), with Themens et al. (2018)  respectively. Functionally, E-CHAIM uses the F2-peak as an anchor point, from which it then models the vertical structure of the ionosphere using a semi-Epstein layer, similar to that of the NeQuick (Nava et al., 2008) but with height-varying scale thickness in both the topside and bottomside. In this manner, E-CHAIM's topside electron density is driven purely by sub-models of hmF2, NmF2, and topside scale thickness (HTop), while E-CHAIM's TEC includes contributions from all model components. A schematic representation of how these parameters affect the structure of the topside is presented in Themens et al. (2018), and a detailed discussion of the analytical behavior of the E-CHAIM and NeQuick topside functions can be found in Pignalberi et al. (2020). The sub-models of NmF2, hmF2, and HTop were fit primarily using global ionosonde, radio occultation, topside sounder, and Incoherent Scatter Radar data, where the hmF2 and NmF2 sub-models are each composed of 24 separate models, one for each UTC hour. E-CHAIM's bottomside is represented by a series of layers in scale thickness, with an HBot parameter controlling the dominant variability of the bottomside and other sub-models adding curvature associated with the F1-layer and E-Region. As we are here using E-CHAIM version 2.0.0, there is no auroral enhanced E-Region included in the model results presented here. The implications of this will be discussed in Section 3.
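To illustrate how a peak-anchored semi-Epstein topside translates NmF2, hmF2, and a scale thickness into electron density and a TEC contribution, a minimal Python sketch is given here. It uses the standard semi-Epstein form N(h) = 4 NmF2 exp(z) / (1 + exp(z))², with z = (h − hmF2)/H. The linear growth of H with height (the `growth` parameter), the 2,000 km integration ceiling, and all function names are assumptions made purely for illustration; they are not E-CHAIM's actual topside function, for which the reader is referred to Themens et al. (2018).

```python
import math


def semi_epstein_topside(h_km, nmf2, hmf2_km, h_top_km, growth=0.0):
    """Electron density (same units as nmf2) at altitude h_km >= hmf2_km.

    Semi-Epstein layer anchored at the F2 peak. The scale thickness grows
    linearly with height above the peak; this growth law is a hypothetical
    stand-in for a height-varying scale thickness, not E-CHAIM's function.
    """
    dh = h_km - hmf2_km
    scale_km = h_top_km + growth * dh
    z = dh / scale_km
    # exp(-z) form is algebraically identical to exp(z)/(1+exp(z))**2
    # but cannot overflow far above the peak.
    ez = math.exp(-z)
    return 4.0 * nmf2 * ez / (1.0 + ez) ** 2


def topside_tec_tecu(nmf2, hmf2_km, h_top_km, growth=0.0,
                     h_max_km=2000.0, step_km=5.0):
    """Trapezoidal integral of the topside profile, in TECU (1e16 e/m^2)."""
    n_steps = int(round((h_max_km - hmf2_km) / step_km))
    dens = [
        semi_epstein_topside(hmf2_km + i * step_km, nmf2, hmf2_km,
                             h_top_km, growth)
        for i in range(n_steps + 1)
    ]
    column = sum(0.5 * (dens[i] + dens[i + 1]) for i in range(n_steps))
    return column * step_km * 1.0e3 / 1.0e16  # km -> m, then e/m^2 -> TECU
```

For a constant scale thickness, the topside integral of this profile tends to 2·NmF2·H, so a same-signed underestimation of NmF2 and of the scale thickness reduces the topside TEC multiplicatively rather than additively.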

CHAIN
CHAIN has operated a dense network of GPS receivers and ionosondes in the Canadian Arctic since 2008, which now includes 25 scintillation monitor GNSS receivers and 9 ionosondes (Jayachandran et al., 2009). While CHAIN operates both GPS receivers and ionosondes, we shall here only examine the performance of E-CHAIM using a limited subset of this GPS receiver data and will not examine ionosonde data, as the ionosondes were previously used to test the regularization of the E-CHAIM model fit in Themens, Jayachandran, Galkin, and Hall (2017). The location of the subset of CHAIN GPS receivers used in this study is provided in Figure 1.
The geographic and geomagnetic coordinates of these CHAIN stations are provided in Table 1.
As this is the identical dataset to Themens and Jayachandran (2016), full details of the processing methods and calibration used for this dataset are outlined therein. To briefly summarize:
1. Data is gathered from the original 10 CHAIN GPS receiver sites using the CHAIN ftp linked from http://chain.physics.unb.ca/chain/pages/data_download.
2. GPS receiver biases are calculated and removed using the revised Minimization of Standard Deviations (MSD) method of Themens et al. (2015), while satellite biases, derived by the Center for Orbit Determination in Europe (CODE), are gathered from the University of Bern ftp at http://ftp.aiub.unibe.ch/.
3. Vertical TEC (vTEC), a projection of line-of-sight TEC (sTEC) measurements used to remove the geometric component of sTEC, is derived using the classical thin shell approximation with an assumed shell height of 400 km.
4. vTEC data from all satellite links are averaged together at each time step for comparison to modeled vTEC.
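The projection and averaging steps summarized above can be sketched as follows. This is a minimal illustration of the classical thin shell mapping function with a 400 km shell, assuming that receiver and satellite biases have already been removed from sTEC; the Earth radius value, the function names, and the absence of any elevation cutoff are assumptions of this sketch rather than details taken from Themens and Jayachandran (2016).

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def thin_shell_obliquity(elevation_deg, shell_height_km=400.0):
    """Cosine of the ray zenith angle at the thin shell pierce point."""
    sin_zenith = (EARTH_RADIUS_KM / (EARTH_RADIUS_KM + shell_height_km)
                  * math.cos(math.radians(elevation_deg)))
    return math.sqrt(1.0 - sin_zenith ** 2)


def stec_to_vtec(stec_tecu, elevation_deg, shell_height_km=400.0):
    """Project bias-corrected slant TEC to vertical TEC."""
    return stec_tecu * thin_shell_obliquity(elevation_deg, shell_height_km)


def station_vtec(links, shell_height_km=400.0):
    """Average vTEC over all satellite links at one time step.

    links: list of (stec_tecu, elevation_deg) tuples, one per satellite.
    Returns NaN when no satellites are in view.
    """
    if not links:
        return float("nan")
    values = [stec_to_vtec(stec, elev, shell_height_km)
              for stec, elev in links]
    return sum(values) / len(values)
```

Because the same projection function can be applied to any modeled slant TEC, measured and modeled vTEC remain directly comparable even where the thin shell assumption is imperfect.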

Swarm In Situ LP Data
Swarm is a constellation of satellites that includes an original three satellites (Swarm A, B, and C), as well as a later-adopted Swarm-E satellite (previously referred to as CASSIOPE) that is equipped with a complementary instrument payload, referred to as the Enhanced Polar Outflow Probe (Yau et al., 2006). In this study, we will only make use of data from Swarm A and B, whose daily average orbit altitude above 40°N is plotted in Figure 2.
The Swarm satellites were launched into nearly polar, circular orbits at ∼87.5° inclination. Swarm A and C precess westward in local time at a rate of ∼2.7 h/month and Swarm B precesses away from A and C at a rate of ∼1.5 h/yr (Knudsen et al., 2017). Because of this slow precession of the orbit, we must be very careful in conducting model comparisons so as not to conflate seasonal and local time variations. The distribution of the Swarm data in geomagnetic latitude and Magnetic Local Time (MLT) is presented in Figure 3. For this figure, data have been aggregated in bins of 2.5° in geomagnetic latitude and 0.5 h in MLT.
From Figure 3 we see that, for Swarm A, there is a slight bias in the MLT data distribution over this period in favor of local midnight and local noon at lower latitudes, becoming a single maximum near local noon at the peak of the orbit. For Swarm B, at lower latitudes there are minor data nulls centered at roughly 4 and 16 MLT, while at the peak of the orbit there is a slight preference toward the afternoon and pre-midnight sectors.
The Swarm A, B, and C satellites each operate a pair of gold- and nitrated titanium-coated spherical Langmuir probes that allow for the in situ determination of plasma properties, such as electron density and temperature, at a 2 Hz sampling rate. Detailed information about these probes and how ionospheric characteristics are extracted from their measurements can be found in Knudsen et al. (2017) and Lomidze et al. (2018). Data from these probes was acquired from ftp://swarm-diss.eo.esa.int/Level1b/Entire_mission_data/EFIx_LP/ as a Level 1B product, stored in CDF format. For the purpose of this study, the dataset has been decimated to 15 s time resolution instead of the native sampling rate, as we are not here interested in irregularities or very small-scale structures. The data distribution plots of Figure 3 were generated using this decimated dataset. Data marked as questionable using Flag_Ne values of 30 and 40 are discarded.
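The decimation and quality screening described here can be sketched in a few lines of Python. The sketch assumes that decimation is done by simple subsampling of a gap-free 2 Hz series (averaging within 15 s windows would be an equally plausible reading) and that the samples have already been read from the CDF files into tuples; the function and variable names are hypothetical.

```python
QUESTIONABLE_FLAGS = {30, 40}  # Flag_Ne values discarded in this study


def decimate_and_screen(samples, native_rate_hz=2, target_step_s=15):
    """Subsample LP data to a coarser cadence and drop flagged points.

    samples: list of (time_s, ne, flag_ne) tuples at the native cadence,
             assumed contiguous (no gaps) for this simple stride approach.
    Returns a list of (time_s, ne) tuples.
    """
    stride = int(native_rate_hz * target_step_s)  # 30 samples per 15 s
    decimated = samples[::stride]
    return [(t, ne) for t, ne, flag in decimated
            if flag not in QUESTIONABLE_FLAGS]
```

With real data containing gaps, selecting samples by timestamp rather than by index stride would be the safer choice.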
Since the launch of the Swarm satellites, an extensive validation and quality control effort has been undertaken to ensure the fidelity of the Swarm data products. As part of these efforts, Lomidze et al. (2018) demonstrated that the Swarm in situ electron density measurements require an 8%-11% enhancement (in critical frequency) to match Incoherent Scatter Radar (ISR), ionosonde, and COSMIC Radio Occultation (RO) measurements; as such, we here apply the calibrations of Lomidze et al. (2018) in our analysis.
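Because plasma frequency scales with the square root of electron density, an adjustment quoted in frequency corresponds to a larger fractional change in density. The short sketch below makes this conversion explicit for a purely multiplicative frequency gain; this is an illustrative simplification, and the actual calibration coefficients should be taken from Lomidze et al. (2018). The function names and the approximate constant 8.98 Hz·m^(3/2) are assumptions of this sketch.

```python
import math


def plasma_frequency_hz(ne_m3):
    """Approximate plasma frequency (Hz) for electron density in e/m^3."""
    return 8.98 * math.sqrt(ne_m3)


def density_gain_from_frequency_gain(frequency_gain):
    """Multiplicative density factor for a fractional frequency increase.

    frequency_gain: e.g. 0.08 for an 8% increase in plasma frequency.
    """
    return (1.0 + frequency_gain) ** 2
```

Under this simplification, an 8%-11% increase in frequency corresponds to roughly a 17%-23% increase in electron density.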
For this study, we have chosen not to use Swarm C because of minor concerns regarding potential calibration errors at low density that may not yet be resolved (Lomidze et al., 2018) and because it follows an almost identical orbit to Swarm A and thus provides no additional value to this validation study. Similarly, Swarm-E has not been used because it does not include an instrument capable of measuring in situ electron density at this time.

International Reference Ionosphere
To contextualize the results of the observational validations of E-CHAIM, we also make use of IRI predictions of electron density and TEC using the latest version of the model (IRI-2016). The IRI is the de facto standard in ionospheric specification, recognized by the International Organization for Standardization (ISO), and is widely used by the ionospheric, geodetic, and radio propagation communities (Bilitza, 2018; Bilitza et al., 2011). As in Themens and Jayachandran (2016), we here use the IRI's URSI foF2 map option. Because we have access to a newer version of the IRI than was used in Themens and Jayachandran (2016), we have here opted to use the SHU-2015 hmF2 model option (Shubin, 2015), which should not affect the IRI's estimate of TEC to any significant degree. For bottomside thickness, the Bil-2000  (Coïsson et al., 2006). The index files for this model have been updated up to March 2020 using the files available at https://chain-new.chain-project.net/index.php/chaim/e-chaim/supplementary-support-software, which provides daily updates of the IRI's required solar and geomagnetic index files.

Ionosondes
To assist in the diagnosis of some of the observed differences between E-CHAIM and Swarm, which will be presented and discussed in the following sections, we will make use of ionosonde measurements from a subset of available systems in the North American mid-latitude region, graphically represented in Figure 4.
We have here opted to use ARTIST v5 autoscaled ionosonde data, as the ionosonde comparisons are purely a statistical exploration that should not be severely impacted by potential scaling errors. To further reduce this risk, we have limited the ionosonde data to only that which has a quality score (CS) of 100 or greater (i.e., either 100 or manually scaled).

Validation Using CHAIN TEC
To begin our examination of E-CHAIM's performance as a TEC model, we first present examples of monthly average TEC to provide an impression of the model performance with respect to seasonal and solar cycle variability in Figure 5.
From Figure 5, one may note that E-CHAIM appears to perform quite well in the representation of monthly average vTEC within the polar cap and auroral oval but converges to a similar performance level as the IRI at sub-auroral latitudes (Edmonton and Sanikiluaq). One of the main outcomes from Themens and Jayachandran (2016) was the observed tendency for the IRI to fail to represent medium-timescale (month-to-month) changes in the ionosphere associated with short-term changes in solar activity. This was highlighted via comparisons during sudden, 2-3 month enhancements in solar flux during the Fall of 2011 and Spring of 2014. Based on that study, E-CHAIM was developed with less smoothed solar activity drivers, like 81-day smoothed F10.7 flux and monthly ionospheric (IG) index, instead of annually smoothed values. From Figure 5, we may note that E-CHAIM does a good job in representing the enhancements in TEC during these events, even at sub-auroral latitudes where the enhancement is considerably more pronounced.
To provide a more quantitative metric of model performance, contours of the RMS errors in monthly diurnal median vTEC from E-CHAIM and the IRI are presented in Figure 6 with respect to Altitude Adjusted Corrected Geomagnetic (AACGM) latitude (Shepherd, 2014). The data represented in this figure are generated by first, for each month, calculating the median diurnal vTEC variation from GPS and the models and then determining the RMS differences between the models and observations in their representations of those monthly average diurnal variations. This is done for each station and then plotted together using the station AACGM coordinates. Note that white areas here, and in all later contour plots, represent data gaps where GPS data was unavailable.
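The error metric described in this paragraph can be sketched as follows for a single station and month. The sketch assumes hourly bins for the diurnal variation, which is an illustrative choice not specified in the text, and the function names are hypothetical.

```python
import math
from statistics import median


def monthly_diurnal_median(samples):
    """Median diurnal variation for one station and one month.

    samples: iterable of (hour_of_day, vtec_tecu) pairs, with hour_of_day
             an integer bin index (hourly bins assumed here).
    Returns {hour_of_day: median vTEC}.
    """
    by_hour = {}
    for hour, value in samples:
        by_hour.setdefault(hour, []).append(value)
    return {hour: median(values) for hour, values in by_hour.items()}


def rms_error(observed, modeled):
    """RMS model-minus-observation difference over hours present in both."""
    hours = sorted(set(observed) & set(modeled))
    if not hours:
        return float("nan")  # corresponds to a data gap in the contours
    squares = [(modeled[h] - observed[h]) ** 2 for h in hours]
    return math.sqrt(sum(squares) / len(squares))
```

Applying `monthly_diurnal_median` to the GPS data and to each model, then `rms_error` to each pair, gives one RMS value per station and month.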
Clearly, E-CHAIM performs better at high latitudes than sub-auroral regions, with RMS errors significantly decreasing with latitude irrespective of solar activity or season. At low solar activity, E-CHAIM's errors in the representation of median monthly diurnal variations range from 0.4 TECU at high latitudes to 3.0 TECU at sub-auroral latitudes, with greatest errors during the equinoxes at sub-auroral latitudes. At high solar activity, a similar pattern persists but with errors reaching as high as 5.0 TECU at high latitudes and 7.4 TECU at sub-auroral latitudes during equinox periods. This error behavior is in stark contrast to that of the IRI, presented in Figure 6b, which demonstrates errors that remain generally consistent regardless of latitude, with maxima during the equinoxes at all latitudes. For a full description and diagnostics of IRI performance using this data, the reader is directed to Themens and Jayachandran (2016). An overall impression of model performance is provided in Figure 7, where RMS errors calculated over all available times are presented against AACGM latitude.
While care should be taken when interpreting this summary figure, due to slight sampling differences between the stations, one can see a general trend of significantly improved E-CHAIM performance as one tends to high geomagnetic latitudes. On average, E-CHAIM appears to outperform the IRI by as much as 2.5 TECU in overall RMS error within the polar cap, while performing slightly worse than the IRI at North American sub-auroral latitudes. This can be further examined through Figure 6c, where we present the differences between the E-CHAIM and IRI monthly RMS TEC errors (i.e., the difference between Figures 6a and 6b). In this figure, negative values correspond to improvement by E-CHAIM over the IRI, while positive values correspond to locations/periods where the IRI outperforms E-CHAIM.
From Figure 6c, we see that E-CHAIM outperforms the IRI at high latitudes, particularly during the equinoxes at high solar activity, and during the spring at all latitudes, reaching improvements of as much as 8.5 TECU. The IRI, however, outperforms E-CHAIM by as much as 2.4 TECU at sub-auroral latitudes during the winter, particularly at high solar activity. Interestingly, there appears to be comparable performance between both models in the auroral and polar cap regions during winter periods.
To further examine the performance of E-CHAIM at the CHAIN station locations, we present contour plots of the monthly diurnal median vTEC from CHAIN, the IRI, and E-CHAIM at Edmonton, Sanikiluaq, Iqaluit, and Resolute in Figure 8.
Beginning first with Resolute, in general, E-CHAIM does an excellent job at capturing the seasonal and diurnal variability of vTEC, with the exception of a minor tendency to dampen the semi-annual anomaly by slightly overestimating TEC during the summer daytime. E-CHAIM also tends to slightly underestimate TEC during the March 2014 solar activity enhancement; however, it does capture the existence of this enhancement and its diurnal and seasonal extent. The IRI does not capture this enhancement due to the model's use of 12-month smoothed solar activity proxy indices (Themens & Jayachandran, 2016). At Iqaluit, within the auroral oval, we again see good performance from E-CHAIM overall, but there appears to be a "bite-out" in TEC in the morning sector, comparable to the behavior of the IRI. In general, both models appear to produce the morning rise in TEC too late and the evening decline in TEC too early. Otherwise, E-CHAIM performs quite well with only a slight tendency to underestimate nighttime TEC. Some of this nighttime underestimation may be attributable to auroral precipitation, which was not included in this version of E-CHAIM and may represent a significant contribution to TEC in the nightside (Watson et al., 2021). At Sanikiluaq, we see some similar features to Iqaluit, with a persistent morning sector "bite-out" but good performance during daytime conditions; however, it appears that the nighttime underestimation of TEC is more persistent and severe. At Edmonton, we note particularly interesting behavior from E-CHAIM, where the daytime TEC appears inconsistent both seasonally and diurnally. This would suggest that there is likely a phase offset between the behavior of NmF2 and the topside thickness in E-CHAIM. Because E-CHAIM is made up of several completely independent models, it is possible that there exists a physical inconsistency between one or more of these model components.
Since NmF2 and topside thickness are the dominant controllers of TEC in E-CHAIM, we suggest that an inconsistency could exist between these two parameters.
Noting that topside thickness exhibits diurnal behavior peaking in the morning and evening (Themens et al., 2018) and NmF2 exhibits diurnal behavior dominated by solar zenith angle, it is possible that even a very slight mismatch in the timing of these maxima can create "patchy" diurnal variability in the resulting TEC. This "patchy" behavior is not seen in any of the E-CHAIM parameters on their own. This is further complicated by Edmonton's location within the Main Ionospheric Trough (MIT). Overall, E-CHAIM performs very well within the polar cap and auroral oval but exhibits underestimation at nighttime and in the morning sector at sub-auroral latitudes. It is always possible that some of this mid-latitude underestimation is caused by unaccounted-for plasmaspheric electron density contaminating vTEC from southward GPS ray paths. To address this concern, we have employed the Gallagher plasmaspheric model (Gallagher et al., 1988). Using this model, we have calculated the potential contribution of plasmaspheric electron density to the measured sTEC by integrating the plasmaspheric density from the model above 2,000 km altitude along the GPS ray paths for Edmonton between January 2000 and January 2007. We have then projected these plasmaspheric sTECs using the same projection function as was used for the measured sTEC. This was done to reproduce any effect wrongful projection might have caused. We have then averaged the resulting projected plasmaspheric vTEC contributions over all satellites in view at each instant in time. The resulting plasmaspheric contribution to the measured vTEC at our lower-most latitude site (Edmonton) is provided in Figure 9. As we are not trying to conduct a quantitative analysis and are instead only interested in whether plasmaspheric electron density could account for the observed model-GNSS TEC differences, simply illustrating the relative magnitude of plasmaspheric electron density's contribution should be sufficient here. We see here that the average plasmaspheric contribution to vTEC at Edmonton ranges from less than 0.5 TECU at solar minimum to just under 1.5 TECU at solar maximum. This is simply insufficient to account for the observed average differences, shown in Figure 5, of 2 TECU at solar minimum and up to 5 TECU at solar maximum. Based on these results, it is unlikely that plasmaspheric TEC, by virtue of its very low contribution to vTEC at these locations, can account for the observed model-data differences, at least not in their entirety.
A more likely possibility is that E-CHAIM is overestimating the depth of the MIT or missing another important source of TEC. Given that there were no ionosondes available for the fitting of E-CHAIM in the vicinity of Edmonton and the lower data availability in the MIT as a whole, both because of sparse ionosonde operation and the observational tendency for ionosondes to be incapable of observing very low electron densities within the trough, it is possible that this underestimation is related to the representation of the MIT (particularly NmF2), rather than plasmaspheric contamination. It is also possible that auroral precipitation-enhanced E-Region densities, which are not accounted for in either the IRI or E-CHAIM v2.0.0, could form a non-negligible contribution to nighttime TEC at these sites. Further study is needed to tease out which contribution could be the largest at play, which will be explored in a later study after the inclusion of a particle precipitation module in E-CHAIM.
The above explanation can account for observed errors during nighttime periods; however, E-CHAIM also demonstrates underestimation of TEC at sub-auroral latitudes around local noon. To identify from where exactly these errors could originate in the model, we further compare the model to Swarm observations in the following section.

Diagnosing Errors With Swarm In Situ Measurements
In order to further diagnose the above behavior, we will here make use of Swarm in situ observations. These observations allow us to better examine the behavior of the model spatially and tease apart some of the sources for observed errors.

Swarm Magnetic Latitude and Local Time
To avoid conflating seasonal and local time variability, due to the slow precession of the Swarm orbit, we examine the seasonal-MLT variability of measured and modeled electron density at Swarm orbit in AACGM latitude bins of 50-60, 60-70, and 70-80 MLat in Figures 10 and 11. White pixels in these plots represent missing data due to the precession of the Swarm satellite orbit.
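A minimal sketch of this kind of seasonal-MLT aggregation is given here. It assumes monthly bins, 1 h MLT bins, and a median within each cell; these choices, along with the record layout and function name, are illustrative assumptions rather than the exact settings used for Figures 10 and 11.

```python
from statistics import median

LAT_BANDS_DEG = [(50.0, 60.0), (60.0, 70.0), (70.0, 80.0)]  # AACGM latitude


def bin_by_month_and_mlt(records, lat_band, mlt_bin_h=1.0):
    """Median electron density per (year, month, MLT bin) in one band.

    records:  iterable of (year, month, mlat_deg, mlt_h, ne) tuples.
    lat_band: (lower, upper) AACGM latitude bounds in degrees; the lower
              bound is inclusive and the upper bound exclusive.
    Cells with no samples are absent from the result, which corresponds
    to missing-data pixels in a contour plot.
    """
    lower, upper = lat_band
    cells = {}
    for year, month, mlat, mlt, ne in records:
        if lower <= mlat < upper:
            mlt_index = int((mlt % 24.0) // mlt_bin_h)
            cells.setdefault((year, month, mlt_index), []).append(ne)
    return {key: median(values) for key, values in cells.items()}
```

Running the same function on measured and modeled densities sampled along the orbit yields directly comparable grids, with the slow local time precession showing up as empty cells rather than as an apparent seasonal signal.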
In the 50-60 MLat bin, we note that E-CHAIM tends to underestimate electron density in the morning and daytime sectors. The IRI appears to perform quite well at these latitudes with the exception of a tendency to overestimate electron density near magnetic noon during the winter and equinoxes.
Qualitatively, both models represent the relative MLT structuring observed in the Swarm A and Swarm B data and do a decent job at reproducing electron density at these orbits, except for a clear underestimation of electron density at the highest latitudes by the IRI. To get a better impression of the absolute performance of these models, however, we present the absolute model-data differences for both models and satellites in Figure 12.
In the 50-60 MLat bin, at both Swarm A and B, E-CHAIM demonstrates a general tendency toward underestimation of electron density, particularly during summer and equinox daytime periods; however, the IRI tends to overestimate electron density around MLT noon, mainly during the winter, and to underestimate electron density in the morning and evening sectors during the summer and equinoxes. This is consistent with the observed TEC behavior discussed in Section 3, where the IRI produces a compressed daytime electron density maximum and E-CHAIM underestimates TEC at sub-auroral latitudes. The seasonal behavior of both models is, similarly, highly consistent with the TEC observations of Section 3.
In the 60-70 MLat bin, both models appear to produce many of the same local time and seasonal structures and, as such, produce similar error tendencies. Both models tend to underestimate electron density in the morning and evening sector, particularly at high solar activity, with E-CHAIM's pattern of underestimation extending more into local noon. Comparing the absolute performance of both models, E-CHAIM underestimates electron density slightly less than the IRI in the morning and evening sector, with clearer performance differences at Swarm B than at Swarm A.
The largest differences between E-CHAIM and IRI performance appear in the highest MLat bin (70-80 MLat), where the IRI severely and consistently underestimates electron density, almost universally. E-CHAIM also underestimates electron density, to a lesser extent, at MLT noon and midnight, with sporadic minor overestimation in the morning and evening sectors, but generally exhibits reduced overall error compared to the IRI.
We also see increased underestimation in all latitude sectors by each model during the Spring of 2014 enhancement seen in the TEC data earlier in this study. At Swarm, we note that E-CHAIM appears to capture this enhancement much better than the IRI in the polar cap but, otherwise, both models underestimate electron density between 12 and 18 MLT during the Spring of 2014.
In general, the performance of E-CHAIM relative to the IRI appears to increase as one tends to higher altitudes (e.g., between Swarm A and Swarm B). This is consistent with the results of , which showed that E-CHAIM significantly outperforms the IRI in the upper topside but only marginally outperforms the IRI in the near-peak topside. As discussed in Themens et al. (2018) and Themens, Jayachandran, and McCaffrey (2019), at high latitudes the IRI has a tendency in the near-peak topside for errors in the curvature of its topside function to work against other errors in the model (e.g., NmF2 underestimation), resulting in better than expected performance in the near-peak topside but worse performance as one tends to higher altitudes. As E-CHAIM is constructed in a similar manner as the IRI, in that it is a peak-referenced model, it is highly possible that interactions between the different component models of E-CHAIM could be resulting in greater errors in the near-peak topside, and in TEC, than one might expect given the known good performance of individual model components. In this way, relatively small errors in any given E-CHAIM component could interact in such a way that the overall electron density may be more severely underestimated.

Ionosonde Conjunctions
To assess the degree to which this type of degenerate interaction could be contributing to the observed underestimation of TEC and Swarm electron density at sub-auroral latitudes in the North American sector, we have gathered data from four ionosondes in the United States and ingested the ionosonde-derived hmF2 and NmF2 into E-CHAIM, a feature available in the IDL version of the model. This ionosonde-assisted E-CHAIM electron density is then compared to that measured by Swarm A, the satellite that demonstrated the largest model-data errors. Conjunctions are selected in this case to be any measurements made within 7.5 min of one another, within 0.25° in latitude and 0.5° in longitude. In Figure 13 we present E-CHAIM electron density with and without ionosonde data ingestion for all available conjunctions between 2014 and the end of 2017.
Figure 12. Differences between measured and modeled electron density for E-CHAIM and the IRI with respect to Swarm A (left) and Swarm B (right) for the 50-60 (top), 60-70 (middle), and 70-80 (bottom) geomagnetic latitude bins. Blue implies that the model underestimates measured electron density while red implies overestimation. Gray areas mark periods/MLT sectors with no Swarm observations.
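The conjunction criteria stated in this section (7.5 min, 0.25° in latitude, 0.5° in longitude) can be expressed as a simple test, sketched here in Python. The argument layout and function name are hypothetical, and the wrap-around handling of longitude is an implementation detail added for robustness rather than something described in the text.

```python
def is_conjunction(sat_time_s, sat_lat_deg, sat_lon_deg,
                   sonde_time_s, site_lat_deg, site_lon_deg,
                   max_dt_s=450.0, max_dlat_deg=0.25, max_dlon_deg=0.5):
    """True if a satellite sample and an ionosonde sounding coincide.

    Times are in seconds on a common time base; 450 s equals 7.5 min.
    Longitude differences are wrapped into [-180, 180) so that sites
    near the 0/360 degree meridian are handled correctly.
    """
    dt = abs(sat_time_s - sonde_time_s)
    dlat = abs(sat_lat_deg - site_lat_deg)
    dlon = abs((sat_lon_deg - site_lon_deg + 180.0) % 360.0 - 180.0)
    return dt <= max_dt_s and dlat <= max_dlat_deg and dlon <= max_dlon_deg
```

Each conjunction found this way pairs one Swarm electron density sample with one set of ionosonde-derived hmF2 and NmF2 values for ingestion into the model.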
In Figure 13, we see a consistent tendency for ingestion of just hmF2, ingestion of just NmF2, or ingestion of both hmF2 and NmF2 to result in a systematic improvement in the modeled electron density. Prior to ingestion, at each ionosonde, one sees a pattern of underestimation of Swarm electron density by E-CHAIM. By ingesting ionosonde measurements, this underestimation tendency is consistently reduced. Given that E-CHAIM's topside is completely defined by hmF2, NmF2, and HTop, it is presumable that the remaining errors after both hmF2 and NmF2 ingestion are the result of either an underestimation of HTop or an issue in the shape of the model's topside. Interestingly, these conjunction results suggest that the errors from each model component are those of underestimation, meaning that the errors in each of the hmF2, NmF2, and HTop models are constructively adding together to result in a more severe underestimation of electron density at a given altitude in the near-peak topside ionosphere at North American sub-auroral latitudes. Unfortunately, without a global assessment of the model topside with coincident measurements of NmF2 and hmF2, it is not possible at this time to diagnose what specific elements of the hmF2, NmF2, or HTop models could be most to blame for the observed errors.

Discussion
The results presented herein highlight regions of both strong and weak performance by E-CHAIM. While E-CHAIM performs exceptionally well in the auroral zone and polar cap, it suffers from a tendency to underestimate TEC and near-peak topside electron density at sub-auroral latitudes. The cause of this issue could be a combination of two factors: dataset limitations and parameterization limitations.
In terms of the E-CHAIM parameterization itself, TEC and topside electron density are mainly influenced by two separate models: one for NmF2, and one for the topside thickness. The shortcomings in E-CHAIM's TEC and electron density representation at sub-auroral latitudes could result from a corresponding error in one of these models or through the interactions of errors in these models. In Section 4.2, we demonstrated that, at least at American ionosonde locations, what are likely small underestimations of NmF2, hmF2, and HTop can result in large combined effects on the electron density in the near-peak topside and as such it is unlikely that a single problem can be addressed to remedy the observed errors.
It has recently been discovered that GNSS Radio Occultation (RO) data from between 45°N geomagnetic latitude and 60°N geographic latitude was unintentionally excluded from the fitting dataset of the E-CHAIM NmF2 and hmF2 models. Given the significant lack of ionosonde measurements in North America within that region, the failure to include the RO measurements could have significant implications for the performance of the model in that region and may account for some of the observed anomalies. The exclusion of this dataset appears to have simply been an oversight that occurred when the planned domain of the model was changed from the originally planned 60°N geographic latitude lower boundary to the current 50°N geomagnetic latitude lower boundary during model development. One should note that the E-CHAIM topside model uses the full RO dataset, down to 45°N geomagnetic latitude, and is not subject to this problem, as the topside dataset was gathered and processed separately when the topside model was developed (Themens et al., 2018).
Furthermore, the fitting dataset for the E-CHAIM topside model includes a strong representation within the auroral zone and polar cap, with ISRs at Tromso, Svalbard, Resolute, Poker Flat, Sondrestrom, and Malvern (Themens et al., 2018), while sub-auroral latitudes only had ISR data contributed from the Millstone Hill and Kharkiv ISRs, both of which are within the E-CHAIM fitting domain but below the recommended lower magnetic latitude boundary of the model. While RO data covered the entire E-CHAIM domain, the ISR data form the dominant portion of the E-CHAIM topside model dataset. Systematic erroneous behavior in this region would suggest either that the RO data, upon which this region's fitting relies, is subject to systematic errors in this region or that the limited amount of RO data was not able to provide sufficient weight in the fitting to adequately represent this region. As this region includes the MIT, which produces strong horizontal gradients that are known to compromise the RO technique (Shaikh et al., 2018; Yue et al., 2010), it is entirely possible that there are errors in the shape of RO profiles in this region; however, at the moment there exist no studies that have characterized the impact of MIT horizontal structuring on Abel-inverted RO profile shape, despite some studies having used this data to study the trough (Lee et al., 2011).
This challenge highlights one of the largest hindrances in using a peak-referenced parameterization to represent the electron density profile of the ionosphere. In such a parameterization, each individual component may exhibit only very small errors; however, when combined, the interactions of these errors can produce anomalous behavior in absolute electron density. To handle this to some extent, models like the IRI and NeQuick (Nava et al., 2008) model their vertical parameters using inter-related components. For example, the topside thickness in those models is designed as a function of other model parameters, like hmF2 and NmF2. This type of approach ensures that the behavior between model components is relatively consistent; however, this approach precludes the possibility of behaviors that cannot be represented as functions of other model parameters. As presented in Themens et al. (2018), the topside thickness, in particular, exhibits local time behavior that cannot be represented as a function of hmF2 and NmF2 since it has phase elements orthogonal to those of hmF2 and NmF2. Overall, we are left with a situation where, in these peak-referenced models, we either have anomalies due to compounding errors from individual model components or cannot represent physical behavior because of a need to tie the model components together in unphysical ways. Empirical modelers will need to explore new approaches to handling issues like these in the future, perhaps by developing new empirical approaches that model the ionospheric state simultaneously in 4D through machine learning techniques (Li et al., 2021). Despite these issues, models like these still significantly outperform most competitors (Shim et al., 2011, 2018). Another large challenge for these models in the representation of the sub-auroral ionosphere is in capturing the MIT and its behavior.
The MIT is a complex and dynamic high latitude structure that is challenging to represent empirically with conventional ionospheric datasets. For example, the trough can be as thin as 5°-7° in latitude (Aa et al., 2020), which would require ionosonde observations at a comparable station spacing in order to properly constrain its structure in an empirical model. Even if this station density were available, in the case of the spherical cap harmonics used by E-CHAIM, we would need to increase the number of harmonics to 9° on a 45° spherical cap in order to resolve those spatial scales, which would cause the model fit to become unstable in regions without observations, such as over the oceans. To mitigate this type of issue, Deminov and Shubin (2018) proposed the use of parameterizations specific to the MIT as a means of including this structure in empirical models without having to increase resolutions globally. We feel that, based on the present results, a similar parameterization may be well warranted and could, in fact, allow for better representation of not only the trough itself, but also its dynamics during geomagnetic storms. However, the traditional choice to empirically model electron density using the F2-peak as an anchor and simplified shape functions for vertical structure will make it challenging to expand this type of approach to properly model features in the trough's vertical structure, such as the trough's vertical tilt (Jones et al., 1997), and may warrant further innovation.
We look forward to expanded observational capacity in central Canada to help further diagnose the nature of these challenges and develop mitigation strategies. In particular, new techniques for measuring F2 peak density from SuperDARN, which has substantial coverage over central Canada and high latitudes, may provide new opportunities to improve E-CHAIM in sub-auroral regions (Bland et al., 2014; Koustov et al., 2020; Ponomarenko et al., 2011).

Conclusions
Here, we have examined the performance of E-CHAIM in the representation of TEC at high latitudes within the Canadian sector between 2009 and 2015. Within the polar cap, E-CHAIM demonstrates monthly RMS vTEC errors as low as 0.4 TECU during solar minimum summer but as high as 5.0 TECU during solar maximum equinox conditions. These errors represent an improvement of up to 8.5 TECU over the errors of the IRI in the same region. At sub-auroral latitudes, E-CHAIM errors range between 1.0 and 7.4 TECU, with greatest errors during the equinoxes at high solar activity. In comparison to the IRI, these errors constitute a slight (up to 0.5 TECU) improvement during summer periods but worse performance during winter periods by up to 2.4 TECU at high solar activity. In contrast to the IRI's tendency for latitudinally consistent TEC errors, E-CHAIM errors in vTEC vary significantly with magnetic latitude: they are lowest in the polar cap and increase toward lower latitudes.
To further examine the nature and causes of E-CHAIM's TEC error behavior, we also make use of Swarm A and B observations of in situ electron density in the near-peak topside ionosphere. From these observations, we note that E-CHAIM's performance degrades as one tends to lower magnetic latitudes, consistent with the GPS observations. E-CHAIM generally underestimates electron density near local noon, particularly at sub-auroral latitudes, by as much as 1.0 × 10^11 e/m^3 at solar maximum. The IRI tends to overestimate electron density near local noon by up to 1.0 × 10^11 e/m^3 at sub-auroral latitudes during solar maximum but also shows a consistent tendency toward underestimation of electron density at Swarm within the polar cap at all local times, again by as much as 1.0 × 10^11 e/m^3. Consistent with Themens, Jayachandran, and McCaffrey (2019), E-CHAIM performance improves with increasing altitude.
Through comparisons between Swarm and E-CHAIM with ingested ionosonde observations, we have found that E-CHAIM's various component models, hmF2, NmF2, and HTop, each contribute to the observed underestimation tendency of E-CHAIM at sub-auroral latitudes, where small errors in the individual components can add constructively to cause larger proportional errors in the near-peak topside and in TEC.
It should also be noted that the TEC validation conducted here spans 2009-2015, which is within the time period of data used to fit E-CHAIM. While Swarm comparisons do not demonstrate any changes at the boundary between the fitted and forecasted periods (i.e., before and after January 2016), further validation with TEC and other datasets should be conducted after 2016 to ensure the forecast capacity of the model beyond its fitting period. Such comparisons would be particularly valuable in the upcoming solar maximum period.

Data Availability Statement
The source code for E-CHAIM in the C, Matlab, and IDL languages is currently openly available online at https://e-chaim.chain-project.net. This study uses version 2.0.0 of E-CHAIM. Infrastructure funding for CHAIN was provided by the Canadian Foundation for Innovation and the New Brunswick Innovation Foundation. Science funding is provided by the Natural Sciences and Engineering Research Council of Canada. The Swarm Extended LP data set used in the study is available at ftp://swarm-diss.eo.esa.int/Level1b/Entire_mission_data/EFIx_LP/. CHAIN data can be acquired from the network's ftp, with instructions provided at http://chain.physics.unb.ca/chain/pages/data_download. The rules of the road for the use of CHAIN data are provided at http://chain.physics.unb.ca/chain/pages/rules.