Google Flu Trends launched in 2008 and provided estimates of influenza activity for more than 25 countries by aggregating Google search queries for flu-related terms. Google shared these search-query-based data in conjunction with the U.S. Centers for Disease Control and Prevention (CDC) in order to estimate levels of influenza activity over time.
The premise of Google Flu Trends was to combine data from the CDC with search queries containing flu-related terms. Researchers proposed that these search data, tuned to flu-tracking information from the CDC, could "nowcast" flu prevalence. However, the first version of Google Flu Trends was flawed in its data collection and modelling practices. Its methodology was to find the best matches among 50 million search terms to fit 1152 data points. Because the search was so broad, the model often overfitted, for instance by mistakenly identifying seasonal search terms such as “high school basketball” as flu predictors, and Google Flu Trends missed the peak of the 2013 flu season by 140 percent. The project now exists as a compilation of historical estimates, though it has also inspired several similar projects that use social media data to predict disease trends.
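The overfitting risk in matching a very large pool of candidate terms against a small number of data points can be illustrated with a toy simulation. The sketch below is not Google's method: the function names, the sample sizes and the use of pure random noise are all assumptions chosen for illustration. It picks, from many unrelated random series, the one that correlates best with a random "flu" series on a short fitting window, then checks how that series performs on held-out data.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def best_spurious_match(n_candidates=2000, n_train=20, n_test=200, seed=0):
    """Select the candidate that best fits the target on the training window.

    Every series is independent Gaussian noise, so any correlation found
    is spurious by construction. All sizes are hypothetical toy values.
    Returns (in-sample correlation, held-out correlation) of the winner.
    """
    rng = random.Random(seed)
    total = n_train + n_test
    target = [rng.gauss(0, 1) for _ in range(total)]

    best_r, best_series = -1.0, None
    for _ in range(n_candidates):
        candidate = [rng.gauss(0, 1) for _ in range(total)]
        r = pearson(candidate[:n_train], target[:n_train])
        if r > best_r:
            best_r, best_series = r, candidate

    # Evaluate the selected series on data it was not selected on.
    held_out_r = pearson(best_series[n_train:], target[n_train:])
    return best_r, held_out_r


if __name__ == "__main__":
    in_sample, held_out = best_spurious_match()
    print(f"in-sample correlation of best match: {in_sample:.2f}")
    print(f"held-out correlation of same series: {held_out:.2f}")
```

With many candidates and few fitting points, the selected series typically looks strongly correlated in-sample while its held-out correlation is close to zero, which is the pattern a seasonal but unrelated term such as “high school basketball” would produce.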
The goal of Google Flu Trends was to use search data to monitor health-tracking behaviors online and thereby reveal the presence of flu-like illness in a population. The intention was to identify disease activity early and respond quickly to reduce the impact of seasonal and pandemic influenza. According to one report, Google Flu Trends was able to predict regional flu outbreaks up to 10 days before they were reported by the CDC. However, because of flaws in the project and its analysis, none of these goals was met to a satisfactory standard.