Mizzou researchers train AI models with social media posts to analyze COVID-19
Social media is where some people turn when they get sick, searching for what their symptoms mean and how to treat them.
People turned to Twitter to share symptoms during the COVID-19 pandemic, and Jiacheng Xie, a PhD student in Dong Xu’s lab at Bond Life Sciences Center, saw this online crowd seeking answers as a dataset ripe for analysis. Now, that sharing helps him keep tabs on the disease through social media after most agencies have put case counting to rest.
Using social media chatter, he created a model that tracks COVID-19 symptoms and recovery cycles, something organizations like the Centers for Disease Control and Prevention are no longer recording.
Xie is using large language models, a type of artificial intelligence that interprets human language, to extract and analyze self-reported patient symptoms from posts. His model can give an accurate analysis of COVID-19 trends, including some user-reported symptoms that weren’t noted by the CDC.
He analyzed 24,316 users who turned to the platform to share their symptoms, and trends emerged in the most commonly reported symptoms, including fever, headache, cough and fatigue. Xie said the model can inform doctors about new COVID-19 cases and new symptoms of the illness.
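As a rough illustration of the counting step described above, here is a minimal Python sketch. It is not Xie’s pipeline: a simple keyword match stands in for the large language model, the symptom list contains only the four symptoms named in this article, and the function names and sample posts are hypothetical.

```python
from collections import Counter

# Assumption: a tiny symptom vocabulary taken from the four symptoms
# the article names. The real model extracts symptoms with an LLM.
SYMPTOMS = ("fever", "headache", "cough", "fatigue")


def extract_symptoms(text):
    """Return the known symptoms mentioned in one post.

    Hypothetical keyword stand-in for the LLM extraction step.
    """
    lowered = text.lower()
    return {symptom for symptom in SYMPTOMS if symptom in lowered}


def symptom_counts(posts):
    """Count how many distinct users mention each symptom.

    posts: iterable of (user_id, text) pairs.
    """
    per_user = {}
    for user_id, text in posts:
        # Merge symptoms across all of a user's posts so that each
        # user is counted at most once per symptom.
        per_user.setdefault(user_id, set()).update(extract_symptoms(text))

    counts = Counter()
    for symptoms in per_user.values():
        counts.update(symptoms)
    return counts


# Made-up sample posts, for illustration only.
sample_posts = [
    ("u1", "Day 2 of COVID: fever and a pounding headache."),
    ("u1", "Still have the fever, now a cough too."),
    ("u2", "Tested positive. Mostly fatigue and cough."),
]
counts = symptom_counts(sample_posts)
```

Counting users rather than posts keeps one very active poster from skewing the trend, which matters when the goal is to see which symptoms are most widely reported.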
Doctors are also able to input patient data, making this a consistent hub for recording even as case counts continue to decline. The platform also tracks tweets documenting a patient’s recovery and whether a reinfection has occurred.
“We think at least the self-reported tweets can provide some insight into clinical studies,” Xie said.
He added that, for example, some users have reported being infected with COVID-19 more than five times, and while this can be mentioned anecdotally, there may not be experimental evidence supporting such recurrence rates yet.
Looking for a challenge
Xie sat at his desk in China, programming websites and debugging code in his first job after earning his master’s degree in computer science. Having entered the corporate world, he felt the work was “not challenging” and was looking for more.
That led him to his current desk and what he believes is a larger, more important purpose. Xie studies bioinformatics and uses his knowledge of programming to create technology to benefit patients. Many of these programs have strong ties to AI, and the COVID-19 tracker isn’t Xie’s first research project of this kind.
Previously, he created a mobile app known as iTongue that offers health recommendations based on a snapshot of the user’s tongue. The app is inspired by traditional Chinese medicine: a tongue image goes into the program, and out comes a personalized recommendation for routine health maintenance. Xie said this kind of medicine considers the tongue a “mirror of the internal state of the body.”
Xie collaborated with doctors on the project, some as far away as Shanghai University of Traditional Chinese Medicine, to provide recommendations for basic diagnosis and beyond.
He added that the application’s most distinctive feature is not simply enabling online consultation with doctors, as many platforms already do, but rather its AI-powered backend, which simulates a doctor’s preliminary diagnostic reasoning.
A multi-use and multi-platform program
While Xie started with COVID-19-related social posts, the tracker is not limited to one disease. It’s also not limited to one social media platform.
“We can also use the same pipeline to track other disease trends,” he said.
Xie’s sights are set on TikTok, Reddit and Instagram for future research with this same model.
He added that future applications of this research will have to use a different social platform. Twitter, now known as X, once had very open access for developers, but under current ownership it has increased the cost of that access.
Twitter provides an open API that unlocks user data for researchers, though access once came at a lower fee. “I don’t think research can pay that price (anymore),” he said.
But Xie’s large language model is evergreen. He added that the model could be applied to other illnesses to paint a picture of their distribution and symptoms.
“Researchers don’t have to do tireless work,” Xie said. “They can just reuse ours.”