Blog

  • Reimagining Social Media Algorithms: My Keynote at Monash University Indonesia

    Reimagining Social Media Algorithms: My Keynote at Monash University Indonesia

    I recently had the incredible honor of delivering the keynote presentation at an event organized by the Monash Data and Democracy Research Hub (MDDRH) at Monash University Indonesia.

As someone deeply invested in the intersection of social media, technology, and audience engagement, I found this invitation both humbling and inspiring. The event was organized by Dr. Ika Idris, Co-Director and Associate Professor of Public Policy and Management, and her team at Monash. It was particularly meaningful as Dr. Idris is a proud alumna of Ohio University and an active member of the external research team of the SMART Lab, which I have the privilege of directing in the Scripps College of Communication.

    The event provided an unparalleled platform to engage with a diverse and dynamic audience on a topic of growing global importance: how democratic principles can guide algorithm design to foster trust, inclusivity, and meaningful engagement.

    About the Monash Data and Democracy Research Hub

    The MDDRH at Monash University Indonesia is a groundbreaking interdisciplinary initiative. Bringing together expertise in data science, cybersecurity, social and political sciences, public policy, and business, the hub serves as a focal point for understanding the critical role of data in shaping democracy in the digital age. Its mission is ambitious yet essential: to foster data-driven research, boost digital literacy, advance digital democracy, and help shape policies for a more informed and resilient society.

    Being invited to speak at an event hosted by such a forward-thinking institution was an honor and a privilege. The MDDRH’s commitment to fostering ethical, responsible, and inclusive data practices aligns closely with my research and professional passions, making this an ideal platform to share insights and strategies for building a better digital future.

    Keynote Highlights: Rethinking Social Media Algorithms

    My presentation focused on the urgent need to rethink the algorithms driving social media platforms, which are often optimized solely for engagement metrics like clicks, shares, and time spent online. While effective at capturing attention, these algorithms can unintentionally amplify misinformation, sensationalism, and societal polarization—undermining the democratic values they should be supporting.

    Key Themes of the Presentation:

    1. The Role of Algorithms in Public Discourse
      I discussed how algorithms influence what we see, share, and believe online, and the profound implications of these mechanisms for democratic processes.
    2. Challenges of Engagement-Driven Models
      I highlighted the unintended consequences of current algorithmic models, including the spread of misinformation, the formation of echo chambers, and the erosion of public trust in digital platforms.
    3. Integrating Democratic Values in Algorithm Design
      Drawing from democratic theory, I proposed a shift toward “democratic algorithms” that prioritize inclusivity, transparency, and civic value over pure engagement metrics.
    4. Real-World Examples and Actionable Strategies
      The presentation featured practical steps for technologists, policymakers, and platform users to collaborate on ethical algorithm design. These strategies included participatory governance frameworks, regular bias audits, and initiatives to promote media literacy.

    Audience

    The audience was as diverse as the hub itself, comprising academics, government officials, policymakers, technologists, and civil society advocates. Their questions, perspectives, and enthusiasm demonstrated the shared commitment to creating a more equitable, informed, and inclusive digital landscape.

    A Fulfilling Experience

    Delivering this keynote was not only a professional highlight but also a deeply personal and fulfilling experience. I have always enjoyed delivering workshops and presentations because of the human connections they foster. This particular event allowed me to bring my expertise from Ohio University in the USA to a global stage, engaging with professionals and scholars who are at the forefront of addressing these challenges.

What made this experience especially meaningful was the alignment between my research and the practical needs of those working to strengthen democracy in the digital age, as well as the opportunity to bring to light the antecedents of trust in the domain of audience engagement. It is always amazing to see how academic insights can inform real-world strategies to tackle pressing societal issues.

    Gratitude and Reflection

    I am profoundly grateful to the Monash Data and Democracy Research Hub for the gracious invitation and the opportunity to be part of their mission to advance digital democracy. Their dedication to fostering interdisciplinary dialogue and driving innovative solutions is nothing short of inspiring.

    This keynote was more than just a presentation—it was a call to action. As the digital age continues to reshape public discourse, the responsibility to ensure that technology serves the common good becomes ever more critical. I look forward to continuing this important conversation and contributing to the shared goal of building resilient digital democracies worldwide.

    Thank you to Monash University Indonesia and the MDDRH team for making this experience so memorable and impactful!

  • Technology’s Role in Shaping Political Reality: The Crisis of Trust— Invited Talk

    Technology’s Role in Shaping Political Reality: The Crisis of Trust— Invited Talk

    I recently had the pleasure of delivering an invited talk titled, “Technology’s Role in Shaping Political Reality: The Crisis of Trust” at the Paramadina Graduate School of Communication in Jakarta, Indonesia. It was a deeply enriching experience where I explored the intersection of technology, media, and politics with a dynamic audience of students, faculty, and media professionals.

    Exploring the Crisis of Trust

    The core of my talk revolved around how digital technologies are reshaping political reality and redefining the public’s trust in institutions. I examined how social media platforms, algorithms, and digital narratives influence political communication and trust-building processes in modern democracies. Key themes included:

    • Algorithmic Amplification and Political Polarization:
      I discussed how algorithms prioritize sensational and divisive content, fueling political polarization and shaping public opinion.
    • Misinformation and Trust Deficits:
      With the proliferation of misinformation, I highlighted the growing crisis of trust in credible news sources and how disinformation campaigns manipulate public discourse.
    • Restoring Trust Through Media Literacy and Policy Reforms:
      I emphasized the need for improved media literacy programs and thoughtful policy frameworks that hold tech platforms accountable while preserving freedom of expression.

    Engaging Discussions and Thought-Provoking Questions

    One of the most rewarding aspects of the session was the interactive Q&A segment, where participants raised insightful questions on pressing global issues like platform accountability, data privacy, and ethical AI governance. We explored how local and global contexts intersect in the fight against misinformation and the future of democratic engagement online.

    I was particularly inspired by the thoughtful perspectives shared by students, many of whom are conducting research on similar topics. Their questions reflected a deep understanding of media dynamics and a strong passion for creating positive change.

    A Memorable Experience

    The warm hospitality extended by the Paramadina Graduate School of Communication made the experience truly unforgettable. The faculty and staff went above and beyond to create a welcoming environment, while the students’ enthusiasm and curiosity made the session both intellectually stimulating and professionally rewarding.

    I am grateful for the opportunity to share my research and learn from the perspectives of such a diverse and engaged academic community. Events like this remind me of the importance of global academic collaboration in addressing some of the world’s most pressing communication challenges.

  • Empowering Crisis Management Through Social Media: Insights from the AHA Centre Workshop

Empowering Crisis Management Through Social Media: Insights from the AHA Centre Workshop

On December 5, 2024, I had the privilege of delivering a training workshop for the ASEAN Coordinating Centre for Humanitarian Assistance on Disaster Management (AHA Centre) in Batam City, Indonesia (the closest part of Indonesia to Singapore, at a minimum distance of 5.8 km across the Singapore Strait). The invitation from the AHA Centre was a unique opportunity to bring my expertise to a region where disaster management is critical.

    The one-day workshop was attended by an incredible group of 50 professionals from the ten ASEAN Member States—Brunei Darussalam, Cambodia, Indonesia, Lao PDR, Malaysia, Myanmar, the Philippines, Singapore, Thailand, and Vietnam. These participants represented a variety of roles, including logistics, disaster preparedness and response, ICT, finance, knowledge management, and senior management.

    Delivering Impactful Training

    I have always enjoyed delivering training workshops like this one because of the human connection it fosters with individuals who are at the forefront of critical work. It is both humbling and inspiring to meet and collaborate with people who are dedicated to making a tangible difference in disaster management.

    For this workshop, I had the honor of leading the first two sessions:

    1. Current Trends and Developments in Social Media – We discussed the evolving social media landscape, from real-time communication tools to the growing role of artificial intelligence, and how these trends can be leveraged to enhance disaster response.
    2. Creating Engaging Social Media Content during Crises – This hands-on session focused on practical strategies for crafting impactful content, whether to raise awareness, mobilize resources, or engage with diverse audiences.

    A Human-Centered Approach

    What I love most about these workshops is how they connect research with practice in such a meaningful way. My academic work on social media and crisis communication directly informed the training materials, allowing me to bridge the gap between theoretical insights and real-world application. It’s amazing to see how these tools and strategies can empower individuals on the ground to save lives and build resilience in their communities.

    Gratitude and Reflection

    I am deeply grateful to the AHA Centre for inviting me to lead this training and to all the participants for their enthusiasm and engagement throughout the day. The opportunity to bring my expertise all the way from Ohio University to Indonesia and to collaborate with staff working at the forefront of disaster management was both professionally and personally fulfilling.

    This workshop is one of many I have delivered over the years, and each one leaves me inspired by the dedication and passion of the people I meet. It is a privilege to contribute to their work, knowing that these conversations and strategies can make a real difference in the lives of those affected by disasters.

Thank you again to the AHA Centre team and the Monash Data and Democracy Research Hub for making this workshop such a rewarding experience. I look forward to continuing to engage with this vital work in the future!

  • Breaking Barriers: MDIA 4011 Students Tackle the Digital Divide

    Breaking Barriers: MDIA 4011 Students Tackle the Digital Divide

    This semester in MDIA 4011: Media and the Digital Divide, our students embarked on a transformative learning journey, diving deep into the multifaceted issues surrounding the digital divide. Working collaboratively in seven teams, they explored diverse topics within this critical field, addressing disparities not only in access but also in media literacy, artificial intelligence, and the broader impacts of social media.

    The projects were as varied as they were impactful, examining:

    • Digital divide challenges and solutions in West Virginia and Appalachian Ohio, offering community-specific insights and strategies.
    • Barriers to digital inclusion faced by Hispanic communities and African American low-income neighborhoods in the United States.
    • The intersection of the digital divide and social gender norms in India, highlighting unique cultural and structural challenges.
    • The broader implications of the digital divide beyond access, focusing on media literacy and the ethical dimensions of artificial intelligence and social media.

    Each team produced a comprehensive report detailing their findings and solutions, and, in an impressive display of creativity and dedication, they also translated their research into engaging videos. These presentations not only demonstrated their grasp of complex issues but also their ability to communicate them effectively to a broader audience.

    I am truly inspired by the effort, passion, and thoughtfulness the students put into these projects. Their work reflects a genuine commitment to understanding and addressing the nuances of the digital divide, and I couldn’t be more proud of what they’ve accomplished.

    Key Challenges Addressed

    1. Access Disparities
      Many communities, such as rural areas in Appalachian Ohio and West Virginia, face significant barriers to accessing reliable internet and technology, exacerbating socio-economic inequalities.
    2. Media Literacy Gaps
      Beyond access, there is a lack of education on how to critically engage with digital content, leaving individuals vulnerable to misinformation and unable to leverage digital tools effectively.
    3. Cultural and Structural Barriers
      Issues like social norms and systemic challenges in low-income neighborhoods hinder equitable participation in the digital age.

    Major Solutions Proposed

    1. Localized Infrastructure Development
      Students recommended tailored solutions like community-led broadband initiatives and public-private partnerships to improve digital access in underserved areas.
    2. Educational Campaigns on Media Literacy
      Introducing targeted programs in schools and community centers to teach critical digital skills and foster responsible online engagement.
    3. Community-Centric Policy Advocacy
      Advocating for inclusive policies that address cultural and structural barriers, such as gender equity programs in tech education and incentives for affordable digital tools in low-income communities.

    These insights underscore the importance of collaborative, community-focused approaches to bridging the digital divide!

    Bravo to the MDIA 4011 team for bringing these important conversations to life!

    Here are videos from the final digital mapping projects in the course:

    Detroit’s Digital Divide

    Digital Divide in Appalachian Ohio

    West Virginia Digital Divide

    Digital Divide: African American Communities in the USA

    Digital Divide amongst Hispanic Communities in the US

    Digital Divide and Healthcare in USA

    Digital Divide in India

Digital Divide in Bulgaria (Spring 2025)

Digital Divide in Argentina (Spring 2025)

Digital Divide in New Zealand (Spring 2025)

Digital Divide in Pakistan (Spring 2025)

  • Getting Started with Voyant Tools for Text Analysis

    Getting Started with Voyant Tools for Text Analysis

    Introduction to Voyant Tools

    Voyant Tools is an open-source web-based application designed to simplify text analysis, making it accessible even for those with little or no background in data analytics or computational linguistics. Its interface is intuitive, offering a range of visualization options that allow users to easily explore patterns in their text data. Although its breadth of tools may seem overwhelming initially, Voyant’s design enables users to start with simple analyses and gradually delve deeper into its capabilities.

    In this tutorial, we will explore how to get started with Voyant Tools, from uploading your texts to using its primary tools for basic text analysis. By the end of this guide, you’ll be able to perform your own analyses and export the results in various formats.

    Getting Started with Voyant Tools

    Voyant Tools’ tagline, “see through your text,” highlights its ability to supplement traditional close reading with computational techniques. This approach can help scholars and researchers validate qualitative observations by providing quantitative evidence, identify trends and anomalies in word usage, and facilitate deeper interpretations of large text corpora.

    Accessing Voyant Tools

Voyant Tools can be accessed for free at https://voyant-tools.org. Users can analyze their own text collections or use existing corpora available on the platform. Let’s explore how to load your texts and begin your analysis.

    Loading Texts into Voyant

    Voyant allows multiple ways to input texts for analysis:

    1. Pasting Text: Directly paste the text you want to analyze into the provided text box.
    2. Using URLs: Enter URLs of webpages or PDFs hosted online, listing each URL on a new line.
    3. Uploading Files: Upload documents in formats such as plain text, MS Word, PDF, RTF, HTML, or XML by selecting the “Upload” button. Click “Add” for each document you wish to include and then “Upload” once all files are ready.
    4. Pre-existing Text Collections: Voyant offers several preloaded corpora, such as the Humanist Listserv Archives and Shakespeare’s plays, which you can access by selecting “Open” from the drop-down menu.

    Once your text is loaded, click on the “Reveal” button to initiate the analysis.

    Basic Analysis Tools in Voyant

After uploading your text or corpus, Voyant’s interface will automatically display its primary tools, including Cirrus, Summary, Corpus Reader, Trends, and Contexts. These tools provide an initial overview of your data, making it easy to start exploring patterns and trends.

    1. Cirrus (Word Cloud)

    The Cirrus tool generates a word cloud that visually represents the most frequent terms in your text, with word size indicating their relative frequency. This tool is highly interactive:

    • Hover over a word to see its exact frequency in the corpus.
    • Click on a word to trigger a dynamic update in other panes, showing trends and contexts for that specific term.
    • To filter out common stop words like “the” or “and,” click on the cogwheel icon above the Cirrus tool, select your text’s language, and remove these words for a more insightful analysis.

    2. Summary Tool

    The Summary pane provides a detailed overview of the corpus, including:

    • Total word count and the number of distinct words.
    • Vocabulary density, indicating the richness of language in your text.
    • Distinctive words that are unique to specific documents within your corpus.

    This tool is particularly useful for identifying the overall structure and distinctive characteristics of your text, helping you to pinpoint areas of interest for deeper analysis.

    3. Corpus Reader

    The Corpus Reader displays the complete text of your corpus. It is designed for an interactive reading experience:

    • Clicking on a word in the reader will highlight all its occurrences across the text.
    • You can use the search bar at the bottom to locate specific words or phrases within the entire corpus.

    This feature is ideal for researchers who wish to perform a close reading in parallel with quantitative text analysis.

    4. Trends Tool

    The Trends tool visualizes the frequency of words throughout your text or across multiple documents in your corpus. It automatically highlights the five most frequent words, but you can add more words for comparison:

    • Clicking on a word in the Cirrus or Reader panes will display its frequency trend in this graph.
    • Clicking a point in the graph will sync with the Reader and Context panes, providing immediate context for the word’s use.

    This tool is beneficial for examining how word usage changes over time or within different sections of your text.

    5. Contexts Tool

    The Contexts tool allows you to see words from your corpus in their surrounding context, providing insights into how specific terms are used in various sentences. By expanding each entry, you can gain a more comprehensive understanding of the textual environment in which these words occur.

    Exporting Data from Voyant Tools

    Voyant Tools makes it simple to export your analysis results. Each pane includes an export option that allows you to save the data in various formats, such as:

    • Images of visualizations, which can be used in presentations or publications.
    • URLs that link directly to the analysis, enabling easy sharing with collaborators.
    • Tab-separated or JSON data for further exploration in other software tools like spreadsheets or statistical analysis programs.

    This flexibility in exporting ensures that your work in Voyant can be seamlessly integrated into larger research projects.

    Advanced Customization and Embedding

    One of Voyant’s standout features is its ability to generate embed codes, allowing you to incorporate interactive visualizations directly into web pages or academic blogs, as demonstrated throughout this post. This makes it a powerful tool for digital humanities projects, where sharing dynamic analyses with a broader audience is crucial.

    Additionally, Voyant provides citations for specific analyses, ensuring that any visualization or data output you include in your research is properly credited.

    Practical Applications of Voyant Tools

    Using Voyant Tools at the early stages of a research project can reveal unexpected patterns and trends that might guide the focus of your study. For example:

    • Quantitative confirmation of key themes in a text corpus or a body of work.
    • Locating key phrases or words that might be pivotal to your analysis.
    • Comparing word usage trends across different authors or genres.

The ease with which Voyant visualizes and quantifies text patterns makes it an invaluable supplement to traditional close reading and other qualitative approaches to text analysis.

    Conclusion

    Voyant Tools offers a user-friendly entry point into the world of text analysis, with robust features for more advanced users. Its interactive visualizations, ease of data export, and ability to handle various text formats make it a versatile tool for both beginners and experienced researchers. As you become more comfortable with Voyant, you’ll find it an indispensable companion in uncovering insights from texts and exploring new dimensions of your research.

    Happy analyzing!

  • Publication Alert: AI in Higher Education

    Publication Alert: AI in Higher Education

    I’m happy to announce the publication of our latest research paper, “AI in Higher Education: Unveiling Academicians’ Perspectives on Teaching, Research, and Ethics in the Age of ChatGPT,” co-authored with Aqdas Malik, Khalid Hussain, Junaid Qadir, and Ali Tarhini. Our study dives into the transformative impact of conversational AI, specifically ChatGPT, on the landscape of higher education. The full paper is available here: AI in Higher Education.

    Exploring AI’s Role in Higher Education

    In an era where technology is rapidly reshaping academia, our research aims to understand how conversational AI tools like ChatGPT are influencing teaching, learning, and research in universities worldwide. We adopted a qualitative approach, interviewing 12 accomplished academicians from institutions in North America, Asia, and Europe to gather in-depth insights into this evolving dynamic.

    Key Areas of Focus

    The study delves into several crucial aspects:

    • Teaching and Learning: Exploring how ChatGPT enhances educational productivity, creativity, and knowledge acquisition, while also highlighting concerns around critical thinking and over-reliance on AI.
    • Research: Investigating AI’s role in brainstorming, analysis, and idea generation, alongside potential risks to research quality and originality.
    • Ethics: Addressing ethical considerations, such as academic integrity and authenticity, that arise from the increasing use of AI in educational settings.

    Major Findings

    Our analysis revealed that while ChatGPT holds great promise in boosting educational practices, it also brings challenges that need careful management:

    1. Enhanced Learning and Creativity: ChatGPT has the potential to revolutionize teaching methods and student engagement, fostering creativity and innovative thinking in academic environments.
    2. Concerns Over Academic Integrity: Issues like plagiarism, over-reliance on AI tools, and authenticity emerged as significant challenges that could impact academic honesty and critical learning processes.
    3. The Need for Ethical AI Policies: The study strongly advocates for the development of institutional policies that promote the responsible use of AI in higher education, ensuring that AI’s benefits are harnessed without compromising ethical standards.

    Implications for Educators and Institutions

    Based on our findings, we recommend a proactive approach for educators and policymakers in managing AI’s role in education:

    • Training and Skill Development: Faculty training programs are essential to equip educators with the skills to integrate AI effectively and ethically into their teaching and research practices.
    • Policy Implementation: Institutions should establish clear guidelines and policies to regulate AI use, maintaining a balance between leveraging technology and preserving academic integrity.

    Conclusion

    Our research contributes to the broader discussion on AI in academia, highlighting both its transformative potential and its challenges. As we navigate this AI-driven future, our study underscores the importance of responsible AI use, advocating for policies and training that empower educators while safeguarding academic values.

    For a detailed exploration of our findings and analysis, you can read the complete paper here: AI in Higher Education: Unveiling Academicians’ Perspectives.

    Stay tuned for more insights into the intersection of AI and education as we continue to explore this rapidly evolving field!

  • Essential Data Cleaning Techniques in Excel

    Essential Data Cleaning Techniques in Excel

    Data cleaning is a crucial step in data analysis, ensuring that the dataset you work with is accurate, complete, and ready for meaningful analysis. Microsoft Excel, with its powerful functions and user-friendly interface, offers a range of tools to help you clean and prepare your data efficiently. This tutorial will walk you through the essential techniques for data cleaning in Excel, from handling missing values to removing duplicates and correcting errors.

    1. Understanding the Importance of Data Cleaning

    Before diving into the specifics, it’s essential to understand why data cleaning is important. Cleaning your data helps:

    • Remove inconsistencies that might skew your analysis.
    • Improve data accuracy for better decision-making.
    • Ensure reliability of the results from any statistical or machine learning models you apply later.

    Now, let’s get into the practical steps you can take in Excel to clean your dataset.

    2. Removing Duplicate Data

    Duplicates can distort your analysis, leading to incorrect conclusions. To remove duplicates in Excel:

    1. Select the Data: Highlight the range of cells that contain your data.
    2. Go to the Data Tab: Click on the Data tab in the ribbon.
    3. Click on “Remove Duplicates”: A dialog box will open, asking you to specify which columns to consider for duplicates.
    4. Choose Columns to Check: Select the columns based on which duplicates should be identified (usually all columns) and click OK.

    Excel will automatically remove duplicate rows, and it will show you how many duplicates were found and deleted.
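If you prefer to review duplicates before deleting them, one common helper-column approach is to flag repeat occurrences first. A minimal sketch, assuming the value you are checking sits in column A starting in row 2:

=IF(COUNTIF($A$2:A2, A2)>1, "Duplicate", "")

Filled down the column, this marks the second and later occurrences of each value, which you can filter and inspect before running Remove Duplicates.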

    3. Handling Missing Data

    Missing values can cause problems in data analysis, especially for statistical calculations. There are a few common ways to handle missing data in Excel:

    a) Filling Missing Values Manually

    If the number of missing values is small, you can fill them manually:

    • Click on the cell with the missing value and enter the appropriate data or placeholder (like “N/A” or “0”).

    b) Using Excel Functions for Missing Data

    If there are many missing values, you can automate the process:

    • Use the =IF(ISBLANK(cell), "value", cell) formula to replace blank cells with a specific value.
• Alternatively, use =AVERAGE(range) to fill in missing numeric data with the average of the non-blank values in that range.
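As a concrete sketch, assuming your raw values sit in column B (rows 2 through 100) and you add a helper column for the cleaned values, the formulas might look like:

=IF(ISBLANK(B2), "N/A", B2)
=IF(ISBLANK(B2), AVERAGE($B$2:$B$100), B2)

The first version substitutes a placeholder, while the second fills blanks with the average of the non-blank values in the range (AVERAGE ignores empty cells).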

    c) Removing Rows with Missing Data

    If missing data is widespread in a row and can’t be filled appropriately, you may choose to remove the row:

    1. Highlight the rows containing missing data.
    2. Right-click and select Delete.

    4. Correcting Inconsistent Data

    Inconsistent data entry is a common problem, especially when data is manually entered. For example, entries like “NY,” “New York,” and “N.Y.” could all refer to the same entity but might be treated differently in analysis. To correct these:

    a) Find and Replace

    1. Use the Find & Replace feature by pressing Ctrl + H.
    2. Enter the incorrect value in the “Find what” box and the correct value in the “Replace with” box.
    3. Click Replace All to correct all instances at once.

    b) Text Functions for Consistency

    Excel’s text functions can also help standardize data:

    • =UPPER(cell): Converts text to uppercase.
    • =LOWER(cell): Converts text to lowercase.
    • =PROPER(cell): Capitalizes the first letter of each word.

    These functions help standardize the format of your text data.
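For example, assuming inconsistently typed city names sit in column A, a helper column with a nested formula such as:

=PROPER(TRIM(A2))

standardizes capitalization and strips stray spaces in a single step.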

    5. Splitting and Merging Data

    Data might come in a format that isn’t immediately useful, such as full names or addresses in a single cell. Excel’s Text to Columns and CONCATENATE features can help.

    a) Splitting Data

    1. Highlight the column with data you want to split.
    2. Go to the Data tab and click on Text to Columns.
    3. Choose a delimiter (like a comma, space, or tab) that separates your data.
    4. Excel will split the data into different columns based on the delimiter.

    b) Merging Data

    To combine data from multiple cells into one:

• Use the CONCATENATE function or the newer TEXTJOIN function: =CONCATENATE(A1, " ", B1) or =TEXTJOIN(" ", TRUE, A1, B1)
    • This formula merges the content of cells A1 and B1 with a space in between.

    6. Using Excel’s Built-in Data Cleaning Tools

    Excel provides several tools specifically designed to clean data efficiently:

    a) Trim Spaces

    Leading, trailing, or excessive spaces can affect data analysis, especially when comparing text values. To remove extra spaces:

    • Use the =TRIM(cell) function to remove all spaces except single spaces between words.

    b) Remove Unwanted Characters

    Sometimes data contains unwanted characters like symbols or line breaks. To clean these up:

    • Use the =CLEAN(cell) function to remove non-printable characters.
    • Combine with =SUBSTITUTE(cell, "unwanted character", "") to remove specific characters from your data.
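As an illustration, assuming the messy text is in cell A2 and the stray character is a non-breaking space pasted in from a webpage, a nested formula such as:

=TRIM(CLEAN(SUBSTITUTE(A2, CHAR(160), " ")))

replaces the non-breaking space (character code 160, which CLEAN does not remove) with a regular space, strips non-printable characters, and then collapses any leftover extra spaces.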

    7. Data Validation

    To prevent future errors in your data, set up Data Validation rules in Excel:

    1. Highlight the cells where you want to apply the validation.
    2. Go to the Data tab and click on Data Validation.
    3. Set rules for what type of data is allowed in these cells (e.g., whole numbers, dates, lists).
    4. Add error alerts to inform users if they enter invalid data.

    Data Validation ensures that new data entries conform to the required format, reducing future inconsistencies.
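If you need a rule beyond the built-in options, the Custom setting in the Data Validation dialog accepts a formula. A sketch assuming you are validating an age column whose first data cell is A2:

=AND(ISNUMBER(A2), A2>=0, A2<=120)

This accepts only numeric entries between 0 and 120 and rejects anything else at the point of entry.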

    8. Advanced Data Cleaning Techniques

    a) Using Conditional Formatting

    Conditional formatting helps highlight cells that need attention:

    • Go to the Home tab and click on Conditional Formatting.
    • Set rules to format cells based on conditions like duplicate values, errors, or specific text.
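For example, with the “Use a formula to determine which cells to format” option, a couple of simple rules (a sketch assuming the data of interest is in column A and A1 is the active cell of your selection) are:

=ISBLANK(A1) to highlight missing values
=COUNTIF($A:$A, A1)>1 to highlight duplicate entries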

    b) Using Pivot Tables for Data Integrity

    Pivot tables can help check data integrity by summarizing your dataset:

    • Create a pivot table to analyze data distributions, spot anomalies, and identify outliers in the data.

    This method is useful for quickly identifying where data might need cleaning or further investigation.

    9. Automating Data Cleaning with Macros

    For repetitive data cleaning tasks, you can automate the process using Excel’s macros:

    1. Go to the View tab and click on Macros.
    2. Click on Record Macro to start recording your actions.
    3. Perform the data cleaning steps.
    4. Stop the macro recording and save it for future use.

    Macros can save you a lot of time by automating routine data cleaning operations.

    10. Finalizing and Documenting Your Data Cleaning Process

    To make your dataset ready for analysis:

    • Document your steps so you have a record of what transformations were applied.
    • Create a backup of the original dataset before making extensive changes.
    • Use a version control system to track updates to your cleaned data.

    Cleaning Social Media Data in Excel

    Cleaning social media data requires specific techniques due to the unstructured nature of content from platforms like Twitter, Facebook, or Instagram. Social media data often contains noise such as hashtags, mentions, URLs, emojis, and inconsistent text formats that need to be standardized. Excel can help streamline this process using functions like SUBSTITUTE to remove unwanted characters (e.g., hashtags and mentions), CLEAN to strip out non-printable characters, and TRIM to eliminate extra spaces. Additionally, text functions like LOWER or UPPER can standardize text case, while FIND and REPLACE allow targeted removal of URLs or specific phrases. By cleaning social media data in Excel, you ensure that your dataset is well-prepared for sentiment analysis, keyword extraction, or other advanced text analytics, enabling more accurate insights into audience behavior and trends.
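As a minimal sketch, assuming each raw post sits in column A starting in row 2, a nested formula along these lines strips the # and @ symbols (keeping the words themselves), lowercases the text, removes non-printable characters, and trims extra spaces:

=TRIM(CLEAN(LOWER(SUBSTITUTE(SUBSTITUTE(A2, "#", ""), "@", ""))))

The same pattern can be extended with additional SUBSTITUTE layers for other recurring noise, such as a specific URL prefix or symbol you want to drop.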

    Conclusion

    Data cleaning in Excel is a critical skill that ensures the accuracy and reliability of your data analysis. By mastering these techniques, you’ll be better equipped to handle data issues and streamline your workflow. From removing duplicates and handling missing data to using formulas for consistency and advanced automation, Excel provides all the tools you need to clean and prepare your data for insightful analysis.

    With these steps, you’re well on your way to creating datasets that are not only cleaner but also more robust for any analytical tasks you undertake.

    Happy cleaning!

  • Data Storytelling Workshop in France

    Data Storytelling Workshop in France

    In May 2024, I had the incredible opportunity to visit France and lead workshops on data storytelling for students at the Institute of Technology (IUT) at the University of Angers. As someone deeply invested in the power of data to inform, inspire, and connect, this week turned out to be not only professionally fulfilling but also personally transformative.

    From the moment I arrived, I was struck by the hospitality and warmth of my hosts in Angers, a charming city rich in history and culture. But what truly stood out were the students—brilliant, curious, and deeply engaged. Their eagerness to dive into the art of storytelling with data, and the creativity they brought to their projects, was inspiring. Seeing them connect the dots between analytics, narrative, and impact reminded me why I do this work in the first place.

    Beyond the workshops, the experience was enriched by the global community that gathered in Angers. With 50 scholars from 17 different countries, the event was a vibrant melting pot of perspectives, ideas, and cultures. It was a joy to exchange insights, form new friendships, and spark research collaborations that I believe will extend well beyond this week.

    There’s something uniquely powerful about crossing borders—geographical, academic, and disciplinary—to share knowledge and learn from others. I left France with a full heart, a head buzzing with new ideas, and a deeper appreciation for the global language of data.

    To the students, colleagues, and hosts who made this journey so meaningful: merci beaucoup. I look forward to where these connections may lead next.

    What truly made this experience unforgettable was the enthusiasm and dedication of the students. From the very first session, they approached each concept with curiosity and a genuine desire to learn. They asked thoughtful questions, engaged in lively discussions, and eagerly explored how to transform data into meaningful narratives. By the end of the workshop series, the students presented their own data storytelling projects, each reflecting not only their growing technical skills but also their creativity and critical thinking. I was genuinely impressed by the depth of their work and the passion they brought to each presentation. Their drive to learn and apply new ideas was both energizing and inspiring.

    This visit also provided a unique platform to share my expertise from the SMART Lab at Ohio University with a European academic audience. As a Visiting Scholar at the University of Angers, I had the opportunity not only to lead data storytelling workshops but also to represent Ohio University at the University of Angers’ Study Abroad Fair. This engagement opened doors for meaningful conversations around future collaborations, academic exchanges, and student mobility. It was a proud moment to contribute to expanding global learning opportunities, aligning with Ohio University’s mission to foster international partnerships and enhance its global footprint. Experiences like these are more than academic—they are bridges connecting people, institutions, and ideas across borders.

One of the highlights of the trip was a guided tour of the stunning Château d’Angers, a medieval fortress that stands proudly above the Maine River, with its iconic black-and-white striped towers and a rich history dating back to the 9th century. Walking through its ancient halls offered a tangible sense of Angers’ deep historical roots and artistic heritage.

    Adding to the cultural immersion, the university organized a treasure hunt across the city, a creative and interactive way for us to get acquainted with Angers’ landmarks, hidden gems, and vibrant streets. It was not only a fun and engaging activity, but also a wonderful way to build camaraderie among participants from across the globe.

  • Publication Alert: Exploring audience engagement with ChatGPT-content on YouTube

    Publication Alert: Exploring audience engagement with ChatGPT-content on YouTube

    In the dynamic world of digital content, a new trend has emerged with significant implications: ChatGPT-related content on YouTube. Our latest journal article is a collaborative research effort between Professors Khalid Hussain, Laeeq Khan, and Aqdas Malik. The study delves into this evolving trend, offering valuable insights for content creators and AI tool developers. After a rigorous blind peer review, the paper was finally published in the prestigious Journal of Digital Business.


    With the rise of generative artificial intelligence (AI), particularly ChatGPT, an intriguing academic conversation has unfolded regarding its use across various fields. However, one area that remained unexplored until now was how audiences engage with ChatGPT-related content on YouTube. Our study, therefore, sought to fill this gap by examining the engagement levels such content receives on this popular platform.

    The research involved a meticulous analysis of data extracted from 100 YouTube videos focused on ChatGPT, collectively amassing 65 million views. We utilized three application programming interfaces (APIs) for this purpose: VidIQ, Tubebuddy, and SocialBlade. To provide a comparative perspective, we also looked at 200 videos from the same content creators but not related to ChatGPT. Our analysis tools included one-way ANOVA, multigroup Structural Equation Modeling (SEM), and comparative line graphs.

    We grounded our study in the Uses and Gratifications (U&G) theoretical framework. This approach helped us understand the motivations behind audience engagement with these videos.

    Key Findings
    Our findings are revealing in several aspects:

    • Increased Engagement with ChatGPT Content: Videos related to ChatGPT garnered significantly higher engagement compared to other content types from the same creators.
    • Subscriber Count Less Influential: Interestingly, the ChatGPT-focused content showed less sensitivity to the channel’s subscriber count. Channels with fewer subscribers often saw higher viewership on these videos.
    • Boost in New Subscribers: Channels experienced a notable increase in new subscribers when they posted ChatGPT-related content, compared to their other videos.

    Implications
    The implications of these findings are manifold and extend across various sectors:

    • For Content Creators: Understanding the audience’s attraction to ChatGPT-related content can guide the development of more engaging material.
    • For AI Tool Developers: Insights into how audiences interact with content about their products can inform future development and marketing strategies.
    • For Advertisers and Content Publishers: This study provides valuable data for targeting and content strategy decisions.

    This pioneering study not only sheds light on the unique dynamics of audience engagement with ChatGPT-related content on YouTube but also opens the door to further research in this area. It offers critical insights for a wide range of stakeholders, from content creators to AI developers, and highlights the ever-changing landscape of digital content consumption.

    The research can be accessed here: https://www.sciencedirect.com/science/article/pii/S2666954423000194

    Abstract

    The emergence of ChatGPT in the broader field of generative artificial intelligence (AI) has sparked scholarly discourse on its utilization in various disciplines. Yet, a significant void exists in our understanding of the dynamics of consumer engagement with content creators producing ChatGPT-related content. Therefore, the present study aims to delineate how ChatGPT-related content garners consumer engagement on YouTube. Data from 100 YouTube videos amassing an aggregate of 65 million views on ChatGPT were extracted using three application programming interfaces (APIs), namely, VidIQ, Tubebuddy, and SocialBlade. We subsequently contrasted this dataset with data from 200 other videos produced by the same creators. The data were analyzed using one-way ANOVA, multigroup SEM, and comparative line graphs. Employing the Uses and Gratifications (U&G) theoretical framework, our results indicate that innovative content such as ChatGPT-related videos garners more engagement than other content types from the same YouTube channels. Intriguingly, this study finds that ChatGPT-focused content exhibited diminished sensitivity to channel subscriber counts, with channels having fewer subscribers achieving higher viewership numbers. Furthermore, ChatGPT-related content induced a surge in new subscribers to the channel compared to the other content types. The present study pioneers the investigation of audience engagement with ChatGPT-related content by juxtaposing it with other content from the same YouTube channels. We also explicate the relationship between content sensitivity and extant subscriber counts. The present study provides vital insights and implications for a diverse audience, including content creators, developers of AI tools, advertisers, and content publishers.