The analysis is technically careful in presentation but methodologically flawed in execution. Its strongest conclusions (vaccines caused >200,000 deaths, VAERS underreporting ×10) are not supported by the model structure or data. At best, it identifies correlations that warrant further, more rigorous study using proper time-series or causal inference methods. I appreciate the effort to work with publicly available data and to think critically about COVID-19, excess deaths, and vaccines. But I have a concern that goes beyond the statistical methods.
Analyses like this—especially when framed in dramatic terms—can have the effect of scaring people away from vaccines altogether. If that happens, the next time a pandemic arrives, many will be more vulnerable not because of the virus itself, but because of fear seeded by uncertain or overstated conclusions today. Public health depends on both science and trust, and if trust is shaken, future lives could be put at risk.
That’s why I think it’s important to ask about motivation. Is the aim here to illuminate genuine uncertainties in the data, or to persuade readers toward a conclusion that may not be fully supported? The difference matters.
Critical debate is necessary. But so is care in how results are communicated, especially when they touch on issues that shape life-and-death decisions for millions of people.
Think that horse has left the barn. Should we ever see another “pandemic”, hopefully we will rely less on vaccines and more on better treatments. During Covid, alternatives to the vaccine were suppressed at scale. A lot of money was involved. There do appear to be medical treatments for Covid, many of them costing next to nothing.
That's not to discount the military need for biodefense that the mRNA platform may provide. But the level of collateral damage acceptable in a war situation is not acceptable against most natural pathogens.
A model can be questioned on methodology and input data issues, BUT it should never be questioned or challenged because it might put people off taking a vaccine or any other intervention. Whether the COVID vaccines reduced transmission, symptoms, or deaths is another issue, and one that should be discussed openly.
Thanks for your comment. If you can please specify what aspects of the model structure or data don't support the conclusions, I would greatly appreciate it.
The track record of vaccines is so negative that if people are scared away from them for any reason, that's a good thing: aluminum, mercury, contaminants, hot lots that kill large groups, major errors in manufacturing, one of "science's failures".
It’s a start but it is not a time series analysis.
At first blush I suspected this was a simple regression rather than a time series analysis. So, I looked a bit deeper. My assessment: the analysis is primarily multiple linear regression with some time-series elements, but it falls short of proper time series analysis. Here's why:
Method: Linear Regression (fitlm):
The code uses MATLAB’s fitlm function to fit linear models (M0, M1, M2) where excess mortality (EM) is regressed on lagged COVID-19 cases and/or vaccination doses. For example:
M0: EM ~ Cases (lagged) (single regressor).
M1: EM ~ Wave1 + Wave2 + ... + Wave8 (COVID cases split into 8 waves).
fitlm assumes independent and identically distributed (i.i.d.) errors, typical of ordinary least squares (OLS) regression. This is not inherently a time series method unless temporal dependencies are explicitly modeled.
There are some time-series elements, which is good:
Lagged Variables: The code tests multiple lags (0–14 days for cases, 0–10 weeks for vaccines) to find the optimal lag (4 days for cases). This is good and a hallmark of time series analysis.
HAC Adjustment: The use of hac (heteroskedasticity- and autocorrelation-consistent standard errors) corrects for autocorrelation in residuals, which is a time-series problem.
Wave Decomposition: Splitting COVID cases into 8 waves tries to capture non-linear temporal patterns (e.g., pandemic waves), mimicking a piecewise approach to seasonality or structural breaks.
Where it falls short: Despite these elements, the analysis leans heavily on regression and lacks critical time-series components:
No ARIMA (Time-Series) Model: It doesn’t use dedicated time-series models like ARIMA, SARIMA, or VAR, which model autoregressive (AR), moving average (MA), or seasonal components directly. Instead, it relies on OLS with lagged predictors, which is a simplistic way to handle time dependence.
No Stationarity Testing: Time-series data (e.g., cases, deaths) often exhibit non-stationarity. The code neither tests for this nor differences the data, which risks spurious regression: excess deaths and cases likely trend together, inflating R². (See the sketch after this list.)
Manual Wave Definition: The wave periods (inflections at weeks 1, 24, 37, etc.) are set manually, not determined statistically. Arbitrary rather than rigorous.
Limited Residual Diagnostics: While it plots residuals, it doesn't systematically test for autocorrelation or model misspecification.
No Dynamic Structure: The model treats lags as fixed predictors, not as dynamic processes, limiting its ability to capture evolving relationships.
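For instance, a minimal R sketch of the stationarity check described above, run on a simulated stand-in series rather than the post's actual data (tseries::adf.test is one standard choice):
library(tseries)                 # provides adf.test()
set.seed(1)
excess = cumsum(rnorm(156))      # stand-in weekly series built to have a unit root
adf.test(excess)                 # H0: unit root; a large p-value means non-stationary
adf.test(diff(excess))           # re-test after first differencing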
Thanks for your critical comments. ARIMA/SARIMA models are useless here, as the objective was to infer causal associations of exogenous variables (COVID cases, doses) with excess deaths, not to forecast future excess deaths from historical excess deaths. I will look into formal stationarity testing. However, the trends you mention contain the actual signal that we are looking to capture and quantify, and differencing risks removing this signal and amplifying higher-frequency noise. Waves were defined manually because it was easier and less time-consuming to plot the curve and visually localize the local minima; coding an algorithm would have taken longer and would have needed visual verification anyway. If there are specific tests for autocorrelation or misspecification that you think are critically important, please let me know and I'll follow up on that. Please also let me know what you suggest I do to incorporate dynamic structure and capture evolving relationships. I tried to keep the model as simple and as interpretable as possible, but I would like to improve it in a logical fashion. Thank you again.
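For concreteness, here is a sketch of a few standard autocorrelation and misspecification tests from the lmtest package, applied to a simulated stand-in fit (not the post's actual model):
library(lmtest)                  # dwtest(), bgtest(), resettest()
set.seed(1)
x = 1:156; y = cumsum(rnorm(156))                      # stand-in trending series
fit = lm(y ~ x)
dwtest(fit)                                            # Durbin-Watson: lag-1 autocorrelation
bgtest(fit, order=4)                                   # Breusch-Godfrey: higher-order autocorrelation
Box.test(residuals(fit), lag=8, type="Ljung-Box")      # portmanteau test on residuals
resettest(fit)                                         # Ramsey RESET: functional-form misspecification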
Thanks for working on these data, and better yet for the commentary from others in this data space. It is sad to imagine the huge number of people within the CDC who must have better access to granular data and who, even if engaged in similar analyses, are apparently not allowed to deviate from the chosen narrative. I suspect they are aware of these Substack articles but perhaps are not allowed to participate. A waste of their talents and our funds.
Antibody-Dependent Enhancement of viral pathogenicity (ADE) kicked in about 6 months after "complete vaccination" in the UK data series, impacting Delta and Omicron deaths in the "vaccinated".
"Complete vaccination" and "boosting" stimulate IgG4 antibodies, "tolerance antibodies" to spike protein, which leads to milder but longer disease in many cases, and more transmission.
I read the above and went through the STIMPED program and found some issues. Looks like statistical illiteracy is spreading. The analysis uses linear regression (fitlm) with lagged predictors, which isn't true time series analysis. It lacks stationarity tests, ARIMA modeling, and seasonal decomposition, and it risks spurious results due to trends. The HAC adjustments help, but they don't really address temporal dynamics. A proper time series approach would use ARIMAX or VAR with rigorous diagnostics. It's a start, but it's not really a time series analysis.
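For illustration, a minimal ARIMAX-style sketch in R: the same kind of regression, but with ARMA errors instead of i.i.d. errors, via base arima(). All series are simulated stand-ins, not the actual data:
set.seed(1)
cases  = pmax(rnorm(156, 100, 30), 0)                    # stand-in weekly cases
doses  = pmax(rnorm(156, 50, 20), 0)                     # stand-in weekly doses
excess = 0.5*cases + 10*arima.sim(list(ar=0.6), 156)     # outcome with AR(1) noise
arima(excess, order=c(1,0,1), xreg=cbind(cases, doses))  # exogenous terms with ARMA(1,1) errors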
This issue of whether or not this is a time series analysis is not important and wastes time. What matters is whether the methodology gives results that can be discussed and analysed. It's a model; some are good, some are not.
We may differ on our definitions of "time-series analysis", mine being the more liberal one. In my other response I cited reasons for avoiding ARIMA/X and sticking with OLS/HAC instead, but you also raise a good point about non-stationarity increasing the risk of spurious regression results. In previous (unreported) analyses I addressed this in part with sensitivity analyses that detrended the excess deaths by removing yearly averages, and by visually inspecting the plots (see Part 2) to ensure results weren't spurious. I will follow up on the other formal tests that you mention as well.
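A sketch of that detrending sensitivity check, assuming me is the weekly data.table built in the replication code later in this thread (the column name excess_dt is hypothetical):
me[, excess_dt := excess - mean(excess), by=year]   # remove yearly averages
summary(lm(excess_dt ~ w1+w2+w3+w4+w5+w6+w7+w8+vax, me))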
More information will help people but is irrelevant to taking them off the market. The precautionary principle has been met, long ago actually, even before they were put on the market. There were more deaths in the "vaccine" arm in the trials.
I do believe everyone can benefit from as much information as we can gather, but these shots are not legally on the market. They were never legally on the market. They should be pulled immediately and all research continued. But first, stop the harm to others.
Sonia Elijah Investigates has this info. "On July 21, 2025, the Informed Consent Action Network (ICAN) announced that it had secured the release of over 600,000 pages of Emergency Use Authorization (EUA) data used by the US Food and Drug Administration (FDA) to authorize and approve Pfizer-BioNTech’s COVID-19 vaccine (BNT162b2), following a successful lawsuit, culminating in a late 2024 court ruling.
These documents, now publicly available on ICAN’s website, are part of a broader release of over 1.6 million pages, including data from the vaccine’s licensure in August 2021 and the earlier EUA in December 2020." https://www.soniaelijah.com/p/the-case-of-the-damning-fda-memos
The above is just one of her many investigations into the clinical trials. She covers both the US and EU. I hope this helps!
5+ years on, it's difficult to believe that humanity believed the biggest lie that had ever been attempted in human history. Covid was man-made to justify a (pretend) cure called a 'vaccine'.
It quickly transpired that the plan was to depopulate the planet while making the Elite organisers of the Scamdemic even richer than they were pre-Scamdemic.
Who wanted to get Genetically Modified with a dangerous experimental injection? That's what's happening with every jab of mRNA poison they call a 'vaccine'.
Remember - the clue is in the PREP Act that shields corrupt vax makers from any/all LIABILITY!
Every mRNA jab continues a program of pre-meditated MASS-MURDER without any culpability.
I have doubts about the quality of the study because every alleged COVID-19 case is supposedly confirmed by a PCR test. It's my understanding that a PCR test cannot identify a virus. There's also the fact that hospitals were paid by Medicare for labeling every health problem, even a broken leg, as a COVID case. So, there appears to be no reliable way to determine the number, if any, of genuine COVID cases.
Thanks for your comment. What matters here is the shape of the COVID waves, not their absolute number, since the model fits the COVID wave shapes to the excess all-cause deaths.
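A toy R sketch of this point with made-up numbers: rescaling a regressor (e.g., under-counted cases) changes its coefficient but not the fit, so only the shape of the wave matters:
set.seed(1)
cases  = abs(rnorm(100))                       # stand-in wave shape
excess = 3*cases + rnorm(100)
summary(lm(excess ~ cases))$r.squared          # identical r^2 ...
summary(lm(excess ~ I(cases/10)))$r.squared    # ... with cases under-counted 10x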
I’m at the end of my rope with non-statisticians mangling statistics every single day. Yesterday it was you—a neuroscientist—pushing out a paper on excess deaths with a so-called “time series” that wasn’t a time series at all. Pure garbage. How would you like it if I published a paper on neuroscience filled with bogus claims about things I know nothing about? Stick to your field. Owning a computer doesn’t make you a statistician. It takes years of graduate study just to master the theory, and years more to learn how to apply it correctly. Yet every clown who’s taken a couple of undergrad stats courses thinks they can crank out valid analyses. They can’t. That’s why research is in shambles: 90% of what’s produced is false. This paper is no exception. It’s wrong from top to bottom—and you don’t even realize it.
Your response from 3 days ago was much more constructive. Please see my response to that comment, and please cite specific ways in which my analysis is wrong or falls short.
Hi
In my view, you allow too little time for the 2nd-shot disaster to appear. Link:
https://metatron.substack.com/p/alberta-just-inadvertently-confessed
So you see that 50% of breakthrough cases appear within 15 days after the 1st shot. But the damage takes more time after the 2nd shot: months 5, 6, and 7 are when breakthrough cases peak.
This guy has a snapshot from two months earlier, on Nov 4th:
https://robertmoloney.substack.com/p/what-the-alberta-covid-19-dashboard
The situation after the first shot is the same; only younger ages have been added.
However, the second shot is still very much evolving; it really jumps within 2 months.
And most likely there is data in the making: 1st shots are done, but there are many people under 5 months after their 2nd shot.
Please note that this Alberta data captures the Delta wave effect; Omicron was a game changer (Robert has some graphs on it).
Also, by eye you can see the correlations to hospitalizations and deaths...
JR
If I understand your comment correctly, you are saying the model could be improved if it could incorporate lagged and persistent effects of the vaccine, and separate out effects of the first dose vs. the second dose? If so, I agree, but there are some challenges/limitations to implementing these changes including multicollinearity etc.
Dr. Spiro P. Pantazatos, here is the problem with your approach to Columbia University. You need to lead with a threat. Tell them that this shot is poison (you can prove it) and that you will see everyone involved in mandating the shot held financially and legally liable for each and every student harmed. Any appeal to morals and 'the right thing to do' is wasted on the sociopaths who find themselves in leadership positions when it comes to making money. Read 'The Allure of Toxic Leadership' by Jean Lipman-Blumen.
Here is an example of presenting consequences versus appeal to morality.
Many decades ago the family community of Rochester, Minnesota was about to get its first strip club. The city council viewed it as a cash cow when it came to taxes. The community was repulsed by the idea and the type of people it would attract. At the city council, moral pleas and fears of what a strip club would do to the community were met with yawns from annoyed council members. The council only paid attention when someone pointed out that the strip club was going to be built next to a busy highway and would need a walk-bridge over it so inebriated customers wouldn't die crossing heavy traffic. This point was followed up with a threat to see each council member held responsible for those deaths.
The strip club's owners did not want to pay for a walk-bridge. Hence, the strip club was never built.
Did you read the latest from Japan? https://www.globalresearch.ca/japan-confirms-600000-citizens-killed-covid-vaccines/5900975 The truth is spilling out!
I greatly admire your skills in both statistical analysis and math. Very impressive charts and graphs -- which means absolutely nothing to someone like me who is, to say the least, arithmetically challenged. Is it possible to express all this in words of three syllables or less? We, the dunces of the world, would appreciate it.
Beyond my ken also. Easier to comprehend is Ed Dowd's book, "Cause Unknown", where he lists hundreds of healthy young people who dropped dead after their COVID vaccine.
Post 2 of the series presents some graphs that hopefully help facilitate a more visually intuitive understanding of the effects.
It's a bioweapon.
Thanks. I got that part already.
How much variation did vaccines explain when you didn't allow each COVID wave to have a different term in your model?
Your 2015-2019 average baseline exaggerates excess deaths in 2021 and 2022 relative to 2020. Part of your variation explained by vaccines might actually be due to your inaccurate baseline, because your baseline produces superfluous excess deaths in 2021 and 2022 that happen to partially coincide with vaccination waves. And because your model has a different term for each COVID wave, it allows the weight of COVID waves in 2021 and 2022 to be reduced in order to accommodate a higher weight to vaccines.
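A toy R illustration of that mechanism with made-up numbers: if pre-pandemic deaths rise about 1% per year, a flat 2015-2019 average baseline sits further below the continued trend each successive year, mechanically adding "excess" deaths in 2021 and 2022:
years  = 2015:2019
deaths = 2.8e6*1.01^(years-2015)          # hypothetical pre-pandemic deaths, +1%/year
avg_base = mean(deaths)                   # flat 2015-2019 average baseline
lin_base = predict(lm(deaths~years), data.frame(years=2020:2022))
round(lin_base - avg_base)                # gap widens from 2020 to 2022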
In your first plot which shows excess deaths in the CDC dataset, there's no week where the excess mortality is even close to zero after the first few weeks of 2020. However at Mortality Watch if you plot ASMR with a 2010-2019 linear baseline, there's even a few weeks with negative excess mortality in March and April of 2022: https://www.mortality.watch/explorer/?c=USA&ct=weekly&df=2020%2520W01&bm=lin_reg.
You wrote that the CDC dataset had a total of 1,743,770 excess deaths in 2020-2022. When I downloaded the CDC dataset, I got the same result for MMWR weeks in 2020 through 2022 as a whole when I looked at the column "Number above average (unweighted)". I got 585,409 excess deaths on MMWR weeks in the year 2020, 670,667 in 2021, and 487,694 in 2022:
library(data.table)
t=fread("AH_Excess_Deaths_by_Sex__Age__and_Race_and_Hispanic_Origin_20250211.csv")
# sum the unweighted excess deaths over all MMWR weeks, grouped by MMWR year
t[Sex=="All Sexes"&RaceEthnicity=="All Race/Ethnicity Groups"&AgeGroup=="All Ages",sum(`Number above average (unweighted)`),MMWRyear]
However when I used my own more accurate method to calculate excess deaths, where I multiplied the 2010-2019 linear trend in CMR for each age by the mid-year resident population estimates for that age, I got only about 1.27 million excess deaths in 2020-2022: sars2.net/rootclaim.html#Table_of_excess_deaths_by_cause. I got about 468,885 excess deaths in 2020, 515,125 in 2021, and 285,019 in 2022. So the CDC dataset had about 117,000 more excess deaths in 2020, 156,000 more in 2021, and 203,000 more in 2022; it exaggerated excess deaths each year, but 2022 was particularly bad.
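A minimal sketch of that baseline method for a single hypothetical age group (all numbers are made up, and the table d is illustrative rather than the actual datasets):
library(data.table)
set.seed(1)
d = data.table(year=2010:2022, pop=3e6+(0:12)*1e4)            # mid-year population
d[, deaths := round(pop*(0.008+0.0001*(year-2010))+rnorm(13,0,200))]
d[, cmr := deaths/pop]                                        # crude mortality rate
fit = lm(cmr ~ year, d[year<=2019])                           # 2010-2019 linear trend in CMR
d[, expected := predict(fit, d)*pop]                          # trend CMR times population
d[year>=2020, .(year, excess=deaths-expected)]                # excess deaths per year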
When individual COVID waves are not modeled, the doses term is not significant (p=0.16) and it explains only an additional 0.4% of variance. Omitted variable bias may be masking the effect of the doses in this model. You raise a valid point about the baseline not taking into account any pre-pandemic yearly trends. Your next sentence, "And because your model has a different term for each COVID wave, it allows the weight of COVID waves in 2021 and 2022 to be reduced in order to accommodate a higher weight to vaccines.", is a bit less clear to me. When the doses term is added, I see that w5 and w6 have lower weights, but w7 and w8 have higher weights. If the vaccines are a better fit to excess deaths then I would expect the individual COVID wave weights to change a bit. I wasn't able to easily locate your (more conservative and probably more accurate) excess death calculations in your second link, but I did see another CDC spreadsheet that appears to correct for yearly (and seasonal) trends (https://www.cdc.gov/nchs/nvss/vsrr/covid19/excess_deaths.htm). Do you have any comments on the methods the CDC uses for this spreadsheet in relation to your method for calculating the excess deaths?
On the website by CDC you linked, the dataset under "Download Data > CSV Format > National and State Estimates of Excess Deaths" has about 1.31 million excess deaths on weeks ending in 2020-2022, so it seems to be more accurate than the CDC dataset you used which had about 1.74 million excess deaths on MMWR years 2020-2022:
library(data.table)
e=fread("Excess_Deaths_Associated_with_COVID-19.csv")
# unweighted and weighted excess deaths for weeks ending in 2020-2022
e[Type=="Unweighted"&State=="United States"&`Week Ending Date`%like%"202[012]",sum(`Observed Number`)-sum(`Average Expected Count`)] # 1312316 (unweighted)
e[Type=="Predicted (weighted)"&Outcome=="All causes"&State=="United States"&`Week Ending Date`%like%"202[012]",sum(`Observed Number`)-sum(`Average Expected Count`)] # 1312344 (weighted)
The code above shows that the excess deaths were nearly the same using the weighted and unweighted figures, because the weighting was used to impute deaths that were missing due to a registration delay, but the last version of the CDC dataset was published in 2023 when only a few deaths were still missing in 2020-2022 because of a registration delay.
But anyway there's still something weird even with CDC's more sophisticated method of calculating the baseline, because on CDC's excess_deaths.htm, if you look at the plot titled "Weekly number of deaths (from all causes)", the actual deaths are below the baseline on almost every week of 2018 and 2019: https://www.cdc.gov/nchs/nvss/vsrr/covid19/excess_deaths.htm. But on the other hand the plot seems correct in that it has weeks with negative excess deaths in March and April 2022.
But in either case it would've probably been better to use one of the other datasets by the CDC that used the more sophisticated method to calculate the baseline rather than the dataset which used a simple 2015-2019 average baseline.
The dataset you used was titled "AH Excess Deaths by Sex, Age, and Race and Hispanic Origin". I didn't even find it linked on CDC's main excess_deaths.html page but only here: https://www.cdc.gov/nchs/covid19/covid-19-mortality-data-files.htm. A note on the same page says: "AH = Ad-hoc. Datasets with the prefix AH are not updated routinely, but can be updated upon request."
---
I posted a more detailed response here: sars2.net/rootclaim2.html#Linear_regression_model_by_Spiro_Pantazatos.
I think I figured out why you didn't include a model where you compared cases plus vaccines against excess deaths without allowing each COVID wave to have its own term, which is that it gave you a negative coefficient for vaccines. Or at least that's what happened to me when I tried to reproduce your models.
And when I compared COVID deaths plus vaccines against excess deaths, it also gave me a negative coefficient for vaccines, regardless of whether I took the excess deaths from the same ad-hoc CDC dataset you used or I used my own more accurate method of calculating excess deaths.
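For reference, a minimal sketch of that kind of model, reusing the objects (owid, me) from the replication code later in this thread (new_deaths is OWID's daily COVID deaths column; the merge below is illustrative, not the exact calculation):
covid=owid[,.(MMWRweek(date),d=new_deaths)][,.(covid=sum(d,na.rm=T)),.(year=MMWRyear,week=MMWRweek)]
m2=merge(me,covid,all.x=T);m2[is.na(m2)]=0
summary(lm(excess~covid+vax,m2)) # check the sign of the vax coefficient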
If you had done the models directly based on COVID deaths instead of cases, you wouldn't even have needed a separate term for each COVID wave, because that would have taken into account how different COVID waves had different CFR values. If your goal was to evaluate the contribution of vaccines to excess deaths beyond the deaths attributed to COVID, then it seemed like a weird approach to do the model indirectly based on COVID cases rather than directly based on COVID deaths.
So did you in fact also do models based on COVID deaths, but you didn't publish them because they gave you a negative coefficient for vaccines?
I forgot to ask: Can you list the exact ranges of MMWR weeks you used for each COVID wave so I can reproduce your model?
01/04/2020 - 06/06/2020: w1
06/13/2020 - 09/05/2020: w2
09/12/2020 - 03/06/2021: w3
03/13/2021 - 06/26/2021: w4
07/03/2021 - 10/30/2021: w5
11/06/2021 - 04/02/2022: w6
04/09/2022 - 10/22/2022: w7
10/29/2022 - 12/31/2022: w8
Thanks, 5 of my weeks were off by 2 and 1 week was off by 1.
I'm now getting an r^2 of about 0.780 without the vaccine term and 0.789 with the vaccine term. So I still don't understand what I'm doing different from your calculation:
library(data.table);library(MMWRweek)
# starting MMWR year and week of each of the 8 waves
waves=data.table(year=rep(2020:2022,c(3,3,2)),week=c(1,24,37,10,26,44,14,43))
owid=fread("owid-covid-data.csv")[location=="United States"]
# weekly vaccine doses aggregated by MMWR year and week
vax=owid[,.(MMWRweek(date),vax=new_vaccinations)][,.(vax=sum(vax,na.rm=T)),.(year=MMWRyear,week=MMWRweek)]
case=fread("daily-new-confirmed-covid-19-cases.csv")[Entity=="United States"]
# shift the daily cases 8 days forwards, then aggregate to weekly resolution
case=case[,.(MMWRweek(Day+8),cases=.SD[[3]])][,.(cases=sum(cases)),.(year=MMWRyear,week=MMWRweek)]
excess=fread("AH_Excess_Deaths_by_Sex__Age__and_Race_and_Hispanic_Origin_20250211.csv")
excess=excess[Sex=="All Sexes"&RaceEthnicity=="All Race/Ethnicity Groups"&AgeGroup=="All Ages"]
excess=excess[,.(excess=`Number above average (unweighted)`,year=MMWRyear,week=MMWRweek)]
me=merge(merge(excess,case,all=T),vax,all=T)[year%in%2020:2022];me[is.na(me)]=0
# assign each week to a wave, then give each wave its own regressor w1..w8
me$wave=me[,findInterval(year*100+week,waves[,year*100+week])]
me=cbind(me,`colnames<-`(sapply(1:8,\(i)me[,ifelse(wave==i,cases,0)]),paste0("w",1:8)))
summary(lm(excess~w1+w2+w3+w4+w5+w6+w7+w8,me)) # r^2 is about 0.780
summary(lm(excess~w1+w2+w3+w4+w5+w6+w7+w8+vax,me)) # r^2 is about 0.789
Thanks for double-checking and for the replication attempt. I was not as precise as you with the lag (I did not apply an 8-day lag to the daily cases prior to down-sampling to weekly resolution, which seems a better approach), plus there may be an error in my code, which I will double-check tomorrow. The OWID website says this about the daily case count: “In addition, there is a delay between testing, confirming, and reporting a case to international organizations. This means the numbers do not necessarily reflect the number of cases on the specific date.” But I don’t see any more details about how long the delay is. What happens if you don’t apply any lag to the cases, or apply just a 3- or 4-day lag? I’ll debug some more on my end and be back in touch shortly.
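One quick way to check that, as a sketch reusing the objects (excess, vax, waves) from the replication code above; the helper fit_lag is hypothetical:
fit_lag=\(L){ # L = lag in days applied to the daily cases
  ca=fread("daily-new-confirmed-covid-19-cases.csv")[Entity=="United States"]
  ca=ca[,.(MMWRweek(Day+L),cases=.SD[[3]])][,.(cases=sum(cases)),.(year=MMWRyear,week=MMWRweek)]
  m=merge(merge(excess,ca,all=T),vax,all=T)[year%in%2020:2022];m[is.na(m)]=0
  m$wave=m[,findInterval(year*100+week,waves[,year*100+week])]
  m=cbind(m,`colnames<-`(sapply(1:8,\(i)m[,ifelse(wave==i,cases,0)]),paste0("w",1:8)))
  sapply(list(excess~w1+w2+w3+w4+w5+w6+w7+w8,excess~w1+w2+w3+w4+w5+w6+w7+w8+vax),
    \(f)summary(lm(f,m))$r.squared)
}
sapply(c(0,3,4,8),fit_lag) # columns = lags in days; rows = r^2 without/with the vax term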
I now used these as the starting weeks of each wave after I had shifted the cases 8 days forwards: 2020 week 1, 2020 week 24, 2020 week 39, 2021 week 12, 2021 week 27, 2021 week 46, 2022 week 16, and 2022 week 45.
However in a model where I included a separate term for each wave, my r^2 value was about 0.780 regardless of whether I added a term for vaccine doses or not: sars2.net/rootclaim2.html#Linear_regression_model_by_Spiro_Pantazatos.
So I don't know what I did different from your calculation. This time I even took data for cases from the OWID dataset "Daily new confirmed COVID-19 cases per million people", which showed a moving average of cases per capita with a daily precision, even though earlier I took the weekly number of cases from the new_cases column in the file owid-covid-data.csv.
After having slept on it, the 2015-2019 average baseline is not necessarily less “accurate”: it is reporting excess deaths relative to a 2015-2019 averaged baseline. If I understand correctly, your approach and the more sophisticated CDC adjustment assume that mortality rates were on an upward trend from 2015 through 2019, that this same upward trend would have continued in years 2020-2022, and that this upward trend has nothing to do with the excess deaths attributable to the pandemic. The first assumption appears questionable in light of a virtually flat age-adjusted death rate from 2013 to 2018 (see https://www.cdc.gov/nchs/data-visualization/mortality-trends/index.htm). When I squint, the line even suggests a slight downward trend from 2017-2018. Moreover, when you look at the first graph of the post (excess deaths across time), the yearly linear trend appears to decrease from 2020 to 2022, based on the “troughs” being lower in each subsequent year. Also, I wouldn’t expect the excess deaths to ever go to zero or be negative in this period, given that US COVID cases were always positive from the beginning of the first wave in 2020 through the end of 2022. Even if we assume the yearly trend in excess deaths was increasing from 2013 through 2019, and even if we assume it would have increased at the same rate in 2020-2022 in the absence of the pandemic, I don't think it makes sense to try and “remove” that trend, because the same reasons that would have caused mortality rates to increase steadily in 2013-2019 (i.e. increasing obesity, chronic disease etc.) would be the same factors that exacerbate and contribute to the pandemic excess deaths (i.e. COVID comorbidities).
Regarding your interesting alternative approach of using COVID deaths, instead of COVID cases, as a predictor of all-cause excess deaths: first, I would not trust the accuracy of a COVID deaths regressor given the varying definitions of ‘COVID deaths’ (i.e. dying with, or of, COVID) and the variability in physicians’ thresholds (and possible hospital incentives) for adjudicating a ‘COVID death’ across time and between states and hospitals; second, given the increased risk of both COVID infection and death in the first weeks post-injection (see e.g. https://metatron.substack.com/p/alberta-just-inadvertently-confessed), the COVID death regressor may actually include a substantial number of vaccine-caused deaths, and would result in model misspecification to the extent that vaccine deaths are conflated with COVID deaths; and third, it is counterintuitive (to me at least) to use one type of death (which is a subset of the main outcome variable) to predict all-cause deaths, and it is not typical of epidemiological time series models that aim to measure the effects of environmental *exposures* that may *contribute to* excess deaths. Compared to COVID deaths, COVID cases were measured relatively consistently throughout the pandemic, and while they did not accurately reflect the true infection rate, they consistently undershot the actual infection rate, and this underascertainment bias doesn’t really affect the model, because it’s the shape of the wave that matters and each wave is scaled anyway to fit the outcome variable.
In response to your last question: no, I did not also do models with COVID deaths. I had not thought of this as a modeling approach until you mentioned it, and I think it is an interesting one to consider. However, for the above reasons, I think COVID cases are the better modeling strategy.
The page you linked shows an age-standardized mortality rate and not a raw number of deaths. You can see here that ASMR was roughly flat in the 2010s but the raw number of deaths went up in the 2010s: https://www.mortality.watch/explorer/?c=USA&t=asmr&df=1999, https://www.mortality.watch/explorer/?c=USA&t=deaths&df=1999. And even before COVID the raw number of deaths was projected to start increasing even more steeply in the 2020s than the 2010s: https://www.census.gov/library/stories/2017/10/aging-boomers-deaths.html.
You said there is an elevated risk of COVID death in the first weeks following vaccination. But in the English ONS data during each of the first four months of 2021, unvaccinated people had higher COVID ASMR than people who had received the first dose less than 21 days ago. During other months the number of deaths in the group "First dose, less than 21 days ago" was so small that the COVID ASMR was not listed. Data for the first three months of 2021 was excluded from the last two editions of the dataset which were based on the 2021 census, but it's still available from the earlier editions: https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/deaths/bulletins/deathsinvolvingcovid19byvaccinationstatusengland/deathsoccurringbetween1january2021and31may2022.
Yes, you are right that raw excess deaths will increase if there are more older people in the population. But age is also a significant predictor of COVID death, so I'm still not clear on the motivation behind removing the linear trend for the baseline. At any rate, I meant to say that both infection risk and risk of death (from side effects such as myocarditis/heart attack or stroke, which could also be classified as "COVID deaths" if the person also tested positive for COVID) increase in the first few weeks post-injection. I'm not too familiar with the ONS data and their methods, but when I took a first pass at the ONS data that you sent (I clicked on "Deaths by vaccination status, England" in Section 2 of the link you sent, and then downloaded and opened Table 1, Column G of the XLS dataset), it actually shows a consistently *higher* all-cause ASMR for "First dose, less than 21 days ago" and "First dose, at least 21 days ago" vs. "Unvaccinated" from June 2021 onwards, and especially in 2022. When I look at "Deaths involving COVID-19", I also see many months where "First dose, at least 21 days ago" has higher ASMR than "Unvaccinated" from May 2021 onward, but what doesn't make sense to me is why "First dose, less than 21 days ago" has so few counts while "First dose, at least 21 days ago" has many more counts in the same month... does this make any sense to you? If enough time passed for a fair number of people to have died more than 21 days after their 1st dose, doesn't that mean there should also be a fair number of deaths among people less than 21 days after their first dose? Or am I missing something?
After the first half of 2021 there were no longer many new people getting vaccinated and especially not among the age groups that have the highest risk of death from COVID, so there's obviously not that many deaths under "First dose, less than 21 days ago". But the group "First dose, at least 21 days ago" even includes people who got vaccinated a year ago. There's also a reduced number of all-cause deaths in the first 3 weeks after vaccination, so even after adjusting for observation time, the group "First dose, less than 21 days ago" has much lower all-cause ASMR than the group "First dose, at least 21 days ago".
And also after the second dose was rolled out, the ASMR of the group "First dose, at least 21 days ago" shot up because the healthy vaccinees moved under the second dose but the so-called "unhealthy stragglers" remained under the first dose. A similar phenomenon can also be seen in the Czech record-level data: sars2.net/czech.html#Plot_for_ASMR_by_dose_and_date.
The reason why the group "First dose, less than 21 days ago" has higher all-cause ASMR than unvaccinated people after early 2021 might be because of a phenomenon I'm calling the "late vaccinee effect", where people who got vaccinated late after the main rollout wave was over had higher mortality than people who got vaccinated during the main rollout wave.
A similar phenomenon exists in the record-level datasets from the Czech Republic and Connecticut and in the UKHSA FOIA data that was given to Clare Craig in May 2024:
sars2.net/czech.html#Triangle_plot_for_excess_mortality_by_month_of_vaccination_and_month_of_death, sars2.net/connecticut.html#ASMR_by_month_of_vaccination, sars2.net/uk.html#Mortality_rate_by_week_of_vaccination_up_to_the_end_of_2022.
One explanation for the late vaccinee effect might be that late vaccinees are less conscientious than people who got vaccinated on time, so they might also have poorer health. Another might be that the late vaccinees include people who had some kind of sickness during the main rollout, so they had to delay vaccination.
But if you look at total ASMR across all months of vaccination, then people who have gotten the first dose less than 3 weeks ago should have much lower ASMR than unvaccinated people: sars2.net/czech2.html#Excess_mortality_by_weeks_after_vaccination.
In this Czech dataset there's a total of 43,633 COVID deaths listed, but only 13 of them occurred on the same week number as the week of the first vaccination (https://www.nzip.cz/data/2135-covid-19-prehled-populace):
library(data.table) # for fread() and the data.table [ syntax below
t = fread("Otevrena-data-NR-26-30-COVID-19-prehled-populace-2024-01.csv")
t[Umrti != "", .N] # 43633 (total COVID deaths)
t[Umrti != "" & Umrti == Datum_Prvni_davka, .N] # 13 (COVID deaths on week of first dose)
Considering that many people got vaccinated during the COVID wave in early 2021, there are actually surprisingly few COVID deaths on the week of the first dose. (Of course, the average observation period here is less than a full 7 days: if, for example, someone got vaccinated on Friday at midday, they had only 2.5 days left to die within the same week they got vaccinated. But assuming people usually got vaccinated on weekdays during working hours, the average exposure time might be around 4.5 days; see the quick check below.)
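A quick check of that 4.5-day figure, assuming shots at midday spread uniformly over Monday to Friday within an ISO Monday-to-Sunday week:
# Days left in the week after a midday shot on Monday..Friday:
days_left = 7.5 - (1:5) # 6.5, 5.5, 4.5, 3.5, 2.5
mean(days_left)         # 4.5 days of average same-week exposure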
But actually upon further thought, maybe it's not surprising that there's a low number of COVID deaths on the week of vaccination, because people normally didn't get vaccinated immediately after they had been diagnosed with COVID, and people usually took at least a few days after a diagnosis to die from COVID.
You better read this:
Why You Can't Use Czech COVID-19 Data for Death Rate Studies
Fri Sep 19 14:14:48 2025
Intro
Over the last few years, two big sets of info from the Czech Republic have been going around in discussions about COVID-19 death rates and vaccine safety. Some folks have used these files, called the old data and new data, like they're official death records from the government. But really, you can't trust either of these sets of data for looking at death rates. They both have big problems with where the data came from and how it's put together, so they're not good for figuring out death rates for the whole population or how vaccines work.
The New Data
This Czech data is supposed to cover everyone in the country, but where it came from tells a different story. Instead of being from the official census or population records, it's mostly from health insurance records: specifically, the National Register of Reimbursed Health Services (NRHZS), plus registers of diseases and vaccines. This might have info on shots, but it's not the same as official death data from the census.
Using insurance data messes things up a lot. Insurance can include people who don't live there permanently, like foreign workers, visitors, or temporary residents. When you mix them into data that's supposed to cover the whole Czech population, you get inflated numbers and tons of questionable records. Instead of showing the real population, the data becomes a mix of citizens, residents, and insured folks just passing through.
Also, it looks like the insurance data was merged with census or registry files in a messy way. It seems like they just did a cartesian join, combining everything without being careful, which makes records multiply and creates duplicates. That's why the data has significantly more records than the actual Czech population. Merging like this might keep the vaccine info, but it ruins the data for death rate stats. If you treat this data like a proper population count, you're going to be wrong. The worst part is that you can't even find and get rid of the bad records.
The Old Data
The old data isn't any better. People say it's from the Institute of Health Information and Statistics of the Czech Republic (ÚZIS), but it's not clear where the data really came from. It looks like ÚZIS started with census data but then had to add in the insurance files for the vaccine details. Again, the mixing and merging messes the files up. So this file is also partly from insurance, and it has the same issues as the new data: uncertain population numbers, possible duplicates, and non-residents. Even without knowing exactly where it came from, there are weird things in the file, like strange death counts and inconsistencies across different groups of people. This makes it clear you can't trust it for death rate studies.
Why Neither Set of Data Works for Death Rate Studies
The main issue is that people think these files are like official population records, but they're not. Census offices don't gather vaccine info. To make these files, health insurance records had to be combined with other sources. But the combinations weren't done well, and the result is not a complete census file or a good sample. It's a mix of data with unknown problems and millions of fake records. Because of all this, both the old and new Czech data are no good for death rate studies. They can't give you accurate numbers for population death rates, and they can't back up good comparisons of vaccinated and unvaccinated people. Any study that treats them like real data about the Czech population is going to give you the wrong answers.
Conclusion
The misuse of these Czech datasets shows a bigger issue in COVID-19 studies: the urge to treat any big set of data as the truth, no matter where it came from or how it's put together. If you don't carefully check where the data came from, how it was combined, and who it covers, then any results you get are meaningless. In this case, the problems are so bad that both sets of data need to be thrown out for death rate studies. Continuing to use them just keeps the mistakes going.
The analysis is technically careful in presentation but methodologically flawed in execution. Its strongest conclusions (vaccines caused >200,000 deaths, VAERS underreporting ×10) are not supported by the model structure or data. At best, it identifies correlations that warrant further, more rigorous study using proper time-series or causal inference methods. I appreciate the effort to work with publicly available data and to think critically about COVID-19, excess deaths, and vaccines. But I have a concern that goes beyond the statistical methods.
Analyses like this—especially when framed in dramatic terms—can have the effect of scaring people away from vaccines altogether. If that happens, the next time a pandemic arrives, many will be more vulnerable not because of the virus itself, but because of fear seeded by uncertain or overstated conclusions today. Public health depends on both science and trust, and if trust is shaken, future lives could be put at risk.
That’s why I think it’s important to ask about motivation. Is the aim here to illuminate genuine uncertainties in the data, or to persuade readers toward a conclusion that may not be fully supported? The difference matters.
Critical debate is necessary. But so is care in how results are communicated, especially when they touch on issues that shape life-and-death decisions for millions of people.
Think that horse left the barn. Should we ever see another "pandemic", hopefully we will rely less on vaccines and more on better treatments. During Covid, alternatives to the vaccine were suppressed at scale. A lot of money was involved. There do appear to be medical treatments for Covid, many costing very little.
Not to discount the military need for biodefense that the mRNA platform may provide. But collateral damage in a war situation is not as acceptable against most natural pathogens.
A model can be questioned on methodology and input data issues BUT should never be questioned/challenged because it might put people off taking a vaccine or any other intervention. Whether the COVID vaccines reduced transmission, symptoms or deaths is another issue which should be discussed openly.
Thanks for your comment. If you can please specify what aspects of the model structure or data don't support the conclusions, I would greatly appreciate it.
It's not a time series analysis. The author is not statistically literate.
The post analyzed a time-series, which makes it a time-series analysis. Are you saying the outcome variable here is not a time-series?
The track record of vaccines is so negative, that if people are scared away from them for any reason, that's a good thing. Aluminum, mercury, contaminants, hot lots that kill large groups, major errors in the manufacture, one of "science's failures"
It’s a start but it is not a time series analysis.
At first blush I suspected this was a simple regression rather than time series analysis. So I looked a bit deeper. My assessment: the analysis is primarily multiple linear regression with some time-series elements, but it falls short of proper time series analysis. Here's why:
Method: Linear Regression (fitlm):
The code uses MATLAB’s fitlm function to fit linear models (M0, M1, M2) where excess mortality (EM) is regressed on lagged COVID-19 cases and/or vaccination doses. For example:
M0: EM ~ Cases (lagged) (single regressor).
M1: EM ~ Wave1 + Wave2 + ... + Wave8 (COVID cases split into 8 waves).
M2: EM ~ Wave1 + Wave2 + ... + Wave8 + Doses (adds vaccination doses).
fitlm assumes independent and identically distributed (i.i.d.) errors, typical of ordinary least squares (OLS) regression. This is not inherently a time series method unless temporal dependencies are explicitly modeled.
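For readers following along, the model structure looks roughly like this as an R sketch (this is not the author's actual MATLAB code; the data frame d and its columns EM, Wave1..Wave8, and Doses are hypothetical stand-ins for the already-lagged series, and the Newey-West lag is a placeholder):
library(sandwich) # NeweyWest()
library(lmtest)   # coeftest()
# Rough R analogue of model M2: OLS of excess mortality on wave-split cases plus doses
m2 = lm(EM ~ Wave1 + Wave2 + Wave3 + Wave4 + Wave5 + Wave6 + Wave7 +
        Wave8 + Doses, data = d)
# HAC (Newey-West) standard errors, mimicking MATLAB's hac(); lag choice is a placeholder
coeftest(m2, vcov = NeweyWest(m2, lag = 4))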
There are some time-series elements, which is good:
Lagged variables: The code tests multiple lags (0–14 days for cases, 0–10 weeks for vaccines) to find the optimal lag (4 days for cases). This is a hallmark of time series analysis.
HAC adjustment: The use of hac (heteroskedasticity- and autocorrelation-consistent standard errors) corrects for autocorrelation in residuals, which is a time-series problem.
Wave decomposition: Splitting COVID cases into 8 waves tries to capture non-linear temporal patterns (e.g., pandemic waves), mimicking a piecewise approach to seasonality or structural breaks.
Where it falls short: despite these elements, the analysis leans heavily on regression and lacks critical time-series components:
No dedicated time-series model: It doesn't use models like ARIMA, SARIMA, or VAR, which model autoregressive (AR), moving average (MA), or seasonal components directly. Instead, it relies on OLS with lagged predictors, which is a simplistic way to handle time dependence.
No stationarity testing: Time-series data (e.g., cases, deaths) often exhibit non-stationarity. The code doesn't test for or difference the data, which risks spurious regression; excess deaths and cases likely trend together, inflating R².
Manual wave definition: The wave periods (inflections at weeks 1, 24, 37, etc.) are manually set, not statistically estimated; arbitrary rather than rigorous.
Limited residual diagnostics: While it plots residuals, it doesn't systematically test for autocorrelation or model misspecification (see the sketch after this list).
No dynamic structure: The model treats lags as fixed predictors, not as dynamic processes, limiting its ability to capture evolving relationships.
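For concreteness, here's a minimal R sketch of the kinds of formal tests meant above (reusing the hypothetical m2 and d from the earlier sketch; lag orders are arbitrary):
library(tseries) # adf.test()
library(lmtest)  # dwtest(), bgtest()
# Stationarity check on the outcome series
adf.test(d$EM)   # augmented Dickey-Fuller test
# Autocorrelation / misspecification checks on the fitted model
dwtest(m2)       # Durbin-Watson (lag-1 autocorrelation)
bgtest(m2, order = 4) # Breusch-Godfrey up to lag 4
Box.test(residuals(m2), lag = 10, type = "Ljung-Box")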
Thanks for your critical comments. ARIMA/SARIMA models are useless here, as the objective was to infer causal associations of exogenous variables (COVID cases, doses) with excess deaths, not to forecast or predict future excess deaths based on historical excess deaths. I will look into formal stationarity testing. However, the trends you mention contain the actual signal that we are looking to capture and quantify, and differencing risks removing this signal and amplifying higher-frequency noise. Waves were manually defined because it was easier and less time-consuming to plot the curve and visually localize the local minima based on the plot; coding an algorithm would have taken me longer and would have needed visual verification anyway. If there are specific tests for autocorrelation or misspecification that you think are critically important, please let me know and I'll follow up on that. Please also let me know what you suggest I do to incorporate dynamic structure to capture evolving relationships. I tried to keep the model as simple and as interpretable as possible, but I would like to improve it in a logical fashion. Thank you again.
Not to mention (which you didn't) the probable extreme bias in the Covid death data. Nice work.
What I miss is that the many reports from jabbed people about their subsequent C19 infections are not reflected in this calculation.
These reports indicate a self-promoting effect of infections after the "protection" of the shots.
In Post 2 I speculate briefly about how the bump in w4 might have been a "mini" COVID wave caused by the peak in vaccinations in April 2021.
Thanks for working on these data. Better still is the commentary from others in the same data space. It's sad to imagine the huge number of people within the CDC who must have better access to granular data and who, if engaged in similar analyses, are apparently not allowed to deviate from the chosen narrative. I suspect they are aware of these Substack articles but perhaps not allowed to participate. A waste of their talents and our funds.
Antibody-dependent enhancement of viral pathogenicity (ADE) kicked in about 6 months after "complete vaccination" in the UK data series, impacting Delta and Omicron deaths in the "vaccinated".
"Complete vaccination" and "boosting" stimulate IgG4 antibodies, "tolerance antibodies" to spike protein, which leads to milder but longer disease in many cases, and more transmission.
I read the above and went through the STIMPED program and found some issues. Looks like statistical illiteracy is spreading. The analysis uses linear regression (fitlm) with lagged predictors, which isn't true time series analysis. It lacks stationarity tests, ARIMA modeling, and seasonal decomposition, and risks spurious results due to trends. The HAC adjustments help, but they don't really address temporal dynamics. A proper time series approach would use ARIMAX or VAR with rigorous diagnostics (a sketch is below). It's a start, but it's not really a time series analysis.
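A minimal ARIMAX sketch in R, using the same hypothetical data frame d as the earlier sketches in this thread (series and column names are placeholders, not the author's data):
library(forecast)
# Exogenous regressors: lagged cases and doses (hypothetical column names)
xreg = as.matrix(d[, c("CasesLagged", "Doses")])
# auto.arima selects (p,d,q) and applies differencing as needed,
# which addresses the stationarity concern directly
fit = auto.arima(d$EM, xreg = xreg)
summary(fit)
checkresiduals(fit) # Ljung-Box test plus residual diagnostic plots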
This issue of whether or not this is a time series analysis is not important and wastes time. What matters is whether the methodology gives results that can be discussed and analysed. It's a model; some are good, some are not.
We may differ on our definitions of "time-series analysis", mine being more liberal. In my other response I cited reasons for avoiding ARIMA/X and sticking with OLS/HAC instead, but you also raise a good point about non-stationarity increasing the risk of spurious regression results. In previous (unreported) analyses I addressed this in part with sensitivity analyses that detrended the excess deaths by removing yearly averages, and by visually inspecting the plots (see Part 2) to ensure the results weren't spurious. I will follow up on the other formal tests you mention as well.
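A minimal data.table sketch of that detrending step (the table d, its excess-mortality column EM, and its Date column date are hypothetical stand-ins, as in the other sketches in this thread):
library(data.table)
setDT(d)
# Remove each calendar year's mean from excess mortality before refitting
d[, EM_detrended := EM - mean(EM), by = year(date)]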
Much appreciation to Dr. Spiro Pantazatos for doing this research and for speaking out, and to Steve Kirsch also.
More information will help people, but it is irrelevant to taking them off the market. The precautionary principle has been met, long ago actually, even before they were put on the market. There were more deaths in the "vaccine" arm of the trials.
I do believe everyone can benefit from as much information as we can gather, but these shots are not legally on the market. They were never legally on the market. They should be pulled immediately and all research continued. But first, stop the harm to others.
Do you remember or have the link to a high quality clinical trial mortality analysis?
Sonia Elija Investigates has this info. "On July 21, 2025, the Informed Consent Action Network (ICAN) announced that it had secured the release of over 600,000 pages of Emergency Use Authorization (EUA) data used by the US Food and Drug Administration (FDA) to authorize and approve Pfizer-BioNTech’s COVID-19 vaccine (BNT162b2), following a successful lawsuit, culminating in a late 2024 court ruling.
These documents, now publicly available on ICAN’s website, are part of a broader release of over 1.6 million pages, including data from the vaccine’s licensure in August 2021 and the earlier EUA in December 2020." https://www.soniaelijah.com/p/the-case-of-the-damning-fda-memos
The above is just one of her many investigations into the clinical trials. She covers both the US and EU. I hope this helps!
5+ years on, it's difficult to believe that humanity believed the biggest lie that had ever been attempted in human history. Covid was man-made to justify a (pretend) cure called a 'vaccine'.
It quickly transpired that the plan was to depopulate the planet while making the Elite organisers of the Scamdemic even richer than they were pre-Scamdemic.
Who wanted to get Genetically Modified with a dangerous experimental injection? That's what's happening with every jab of mRNA poison they call a 'vaccine'.
Remember - the clue is in the PREP Act that shields corrupt vax makers from any/all LIABILITY!
Every mRNA jab continues a program of pre-meditated MASS-MURDER without any culpability.
Unjabbed mick (UK). We live longer.
I have doubts about the quality of the study because every alleged COVID-19 case is supposedly confirmed by a PCR test. It's my understanding that a PCR test cannot identify a virus. There's also the fact that hospitals were paid by Medicare for labeling every health problem, even a broken leg, as a COVID case. So, there appears to be no reliable way to determine the number, if any, of genuine COVID cases.
Thanks for your comment. What matters here is the shape of the COVID waves, not their absolute number, since the model fits the COVID wave shapes to the excess all-cause deaths.
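Here's a toy R illustration (entirely simulated data) of why a constant over- or under-count of cases doesn't change the model fit:
set.seed(1)
cases = abs(rnorm(100))                  # simulated case counts
em    = 2 * cases + rnorm(100, sd = 0.5) # simulated excess mortality
# Rescaling the predictor (e.g., a 10x undercount of cases) changes only
# the coefficient, not the R^2 or the fitted shape:
summary(lm(em ~ cases))$r.squared
summary(lm(em ~ I(cases / 10)))$r.squared # identical R^2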
I’m at the end of my rope with non-statisticians mangling statistics every single day. Yesterday it was you—a neuroscientist—pushing out a paper on excess deaths with a so-called “time series” that wasn’t a time series at all. Pure garbage. How would you like it if I published a paper on neuroscience filled with bogus claims about things I know nothing about? Stick to your field. Owning a computer doesn’t make you a statistician. It takes years of graduate study just to master the theory, and years more to learn how to apply it correctly. Yet every clown who’s taken a couple of undergrad stats courses thinks they can crank out valid analyses. They can’t. That’s why research is in shambles: 90% of what’s produced is false. This paper is no exception. It’s wrong from top to bottom—and you don’t even realize it.
Your response from 3 days ago was much more constructive. Please see my response to that comment. Please cite specific reasons where my analysis is wrong or falls short.