Monday, July 21, 2025
Cyber Defense GO

AI’s Achilles’ Heel: The Data Quality Dilemma

by Md Sazzad Hossain

As AI has gained prominence, all the data quality issues we’ve faced historically are still relevant. However, there are additional complexities when dealing with the nontraditional data that AI often uses.

AI Data Has Different Quality Needs

When AI uses traditional structured data, all the same data cleansing processes and protocols that have been developed over the years can be used as-is. To the extent an organization already has confidence in its traditional data sources, the use of AI shouldn’t require any special data quality work.

The catch, however, is that AI often uses nontraditional data that can’t be cleansed in the same way as traditional structured data. Think of images, text, video, and audio. When using AI models with this type of data, quality is as important as ever. Unfortunately, the traditional methods used for cleansing structured data simply don’t apply. New approaches are required.

AI’s Different Needs: Input And Training

First, let’s use an example of image data quality from the input and model training perspective. Typically, each image has been given tags summarizing what it contains, for example “hot dog,” “sports car,” or “cat.” This tagging, typically done by humans, can contain true errors as well as cases where different people interpret the image differently. How do we identify and handle such situations?

It isn’t easy! With numerical data, it’s possible to identify bad data via mathematical formulas or business rules. For example, if the price of a candy bar is $125, we can be confident it can’t be right because it’s so far above expectation. Similarly, a person shown as age 200 clearly doesn’t make any sense. There really isn’t an effective way today to mathematically check if tags are accurate for an image. The best way to validate a tag is to have a second person assess the image.
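To make the numerical case concrete, here is a minimal sketch of rule-based validation. The field names and thresholds (a candy bar priced at no more than $10, an age of no more than 120) are assumptions picked for illustration, not values taken from the article.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    field: str                         # key to look up in the record
    description: str                   # human-readable statement of the rule
    is_valid: Callable[[float], bool]  # returns True when the value passes


# Hypothetical business rules; the limits are illustrative assumptions.
RULES = [
    Rule("candy_bar_price", "price must be between $0 and $10", lambda v: 0 < v <= 10),
    Rule("age", "age must be between 0 and 120", lambda v: 0 <= v <= 120),
]


def find_violations(record: dict) -> list[str]:
    """Return one message for every rule the record breaks."""
    violations = []
    for rule in RULES:
        if rule.field in record and not rule.is_valid(record[rule.field]):
            violations.append(
                f"{rule.field}={record[rule.field]}: {rule.description}"
            )
    return violations


# Both values from the examples in the text are flagged.
for message in find_violations({"candy_bar_price": 125, "age": 200}):
    print(message)
```

No comparable rule can be written for an image tag, which is why a second human reviewer is the fallback.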

An alternative is to develop a process that uses other AI models to scan the image and see if the tags applied appear to be correct. In other words, we can use existing image models to help validate the data being fed into future models. While there is potential for some circular logic in doing this, models have become strong enough that it shouldn’t be a problem in practice.
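One way such a cross-check might look is sketched below. The `predict` callable stands in for whatever existing image model is available; here it is a hypothetical lookup table so the example runs on its own, and the 0.9 confidence cutoff is an assumption. Images are flagged for human review only when a confident prediction disagrees with the human tag.

```python
from typing import Callable, Iterable


def flag_suspect_tags(
    items: Iterable[tuple[str, str]],
    predict: Callable[[str], tuple[str, float]],
    min_confidence: float = 0.9,
) -> list[str]:
    """Return ids of images whose human tag disagrees with a confident model prediction."""
    suspects = []
    for image_id, human_tag in items:
        predicted_tag, confidence = predict(image_id)
        if predicted_tag != human_tag and confidence >= min_confidence:
            suspects.append(image_id)
    return suspects


# Hypothetical stand-in for a real classifier: image id -> (tag, confidence).
_FAKE_PREDICTIONS = {
    "img1": ("cat", 0.98),
    "img2": ("hot dog", 0.95),
    "img3": ("sports car", 0.55),
}


def fake_predict(image_id: str) -> tuple[str, float]:
    return _FAKE_PREDICTIONS[image_id]


labeled = [("img1", "cat"), ("img2", "cat"), ("img3", "cat")]
print(flag_suspect_tags(labeled, fake_predict))  # ['img2']
```

Low-confidence disagreements such as `img3` are deliberately not flagged, which limits the circularity risk of trusting a weak model over a human.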

AI’s Different Needs: Output And Scoring

Next, let’s use an example of image data quality from the model output and scoring perspective. Once we have an image model that we have confidence in, we feed the model new images so that it can assess them. For instance, does the image contain a hot dog, a sports car, or a cat? How do we assess if an image provided for assessment is “clean enough” for the model? What if the image is blurry or pixelated or otherwise not clear? Is there a way to “clean” the image?

The confidence we can have in what an AI model tells us is in the image depends directly on how clean the image is. In a case such as the image above, how do we know if the image is a blurred view of trees or something else entirely? Even for humans, there is subjectivity in this assessment, and there is no clear path to an automated, algorithmic approach for declaring an image “clean enough” or not. Here, manual review might be best. In the absence of that, we can again use an algorithm that scores the clarity of the input image, together with processes that rate the confidence in the descriptions generated by the model’s assessment. Many AI applications do this today, but there is certainly room for improvement.
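As a rough illustration of a clarity score, the sketch below uses the variance of a Laplacian filter response, a commonly used blur heuristic rather than anything the article prescribes. It assumes a grayscale image given as a list of rows at least 3 pixels in each dimension; the threshold of 100.0 is an arbitrary assumption that would need tuning per dataset.

```python
from statistics import pvariance


def sharpness_score(gray: list[list[float]]) -> float:
    """Variance of a 4-neighbor Laplacian over interior pixels.

    Higher values mean more edge detail; blurred or flat images score low.
    """
    height, width = len(gray), len(gray[0])
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            laplacian = (
                gray[y - 1][x] + gray[y + 1][x]
                + gray[y][x - 1] + gray[y][x + 1]
                - 4 * gray[y][x]
            )
            responses.append(laplacian)
    return pvariance(responses)


def is_clear_enough(gray: list[list[float]], threshold: float = 100.0) -> bool:
    """Hypothetical gate: pass the image to the model only above the threshold."""
    return sharpness_score(gray) >= threshold


# A flat gray image has no edges; a checkerboard has many.
flat = [[128] * 8 for _ in range(8)]
checker = [[255 * ((x + y) % 2) for x in range(8)] for y in range(8)]
print(is_clear_enough(flat), is_clear_enough(checker))  # False True
```

A score like this only measures edge detail. It cannot say whether the subject is recognizable, so it complements rather than replaces manual review.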

Rising To The Challenge

The examples provided illustrate that classic data quality approaches like missing value imputation and outlier detection can’t be applied directly to data such as images or audio. These new data types, which AI is heavily dependent on, will require new methodologies for assessing quality on both the input and the output end of the models. Given that it took us many years to develop our approaches for traditional data, it should come as no surprise that we have not yet achieved similar standards for the unstructured data that AI uses.
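For contrast, the two classic techniques named above take only a few lines for numeric data. This sketch assumes median imputation and the conventional 1.5 × IQR outlier rule, which are common choices rather than ones the article specifies; the price list is made-up sample data.

```python
from statistics import median, quantiles


def impute_missing(values: list) -> list:
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def iqr_outliers(values: list, k: float = 1.5) -> list:
    """Return values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


# Made-up candy bar prices with one gap and one implausible entry.
prices = [1.0, 1.2, None, 1.1, 0.9, 125.0, 1.3]
filled = impute_missing(prices)
print(iqr_outliers(filled))  # [125.0]
```

Neither function has an obvious analogue for a photo or an audio clip: there is no “median image” to fill in, and no quartiles of a tag.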

Until those standards arise, it’s important to:

  1. Constantly scan industry blogs, papers, and code repositories to keep tabs on newly developed approaches
  2. Make your data quality processes modular so that it’s easy to alter or add procedures to make use of the latest advances
  3. Be diligent in studying known errors so you can identify whether patterns exist in where your cleansing processes and models are performing better and worse
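The second and third points can be sketched together: a small registry of interchangeable checks, plus a tally of which checks fail most often. Everything here (the `QualityPipeline` class, the two checks, the 64-pixel minimum) is a hypothetical illustration of the modular idea, not an established library or the author’s design.

```python
from collections import Counter
from typing import Callable

Check = Callable[[dict], list[str]]  # a check returns a list of problems


class QualityPipeline:
    """Holds named checks so procedures can be added or swapped independently."""

    def __init__(self) -> None:
        self._checks: dict[str, Check] = {}

    def register(self, name: str, check: Check) -> None:
        self._checks[name] = check  # re-registering a name replaces the check

    def run(self, record: dict) -> dict[str, list[str]]:
        """Run every check and return only the ones that reported problems."""
        report = {}
        for name, check in self._checks.items():
            problems = check(record)
            if problems:
                report[name] = problems
        return report


def check_has_tag(record: dict) -> list[str]:
    return [] if record.get("tag") else ["missing tag"]


def check_min_resolution(record: dict) -> list[str]:
    width, height = record.get("width", 0), record.get("height", 0)
    if min(width, height) >= 64:  # assumed minimum, for illustration
        return []
    return [f"resolution {width}x{height} is below 64 pixels"]


def failure_counts(reports: list[dict]) -> Counter:
    """Tally how often each check failed, to reveal patterns in known errors."""
    return Counter(name for report in reports for name in report)


pipeline = QualityPipeline()
pipeline.register("tag_present", check_has_tag)
pipeline.register("resolution", check_min_resolution)

reports = [
    pipeline.run({"tag": "cat", "width": 640, "height": 480}),
    pipeline.run({"tag": "", "width": 32, "height": 32}),
]
print(failure_counts(reports))
```

Because each check is just a function behind a name, a newly published technique can be registered or substituted without touching the rest of the pipeline.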

Data quality has always been a thorn in the side of data and analytics practitioners. Not only do the traditional issues remain as AI is deployed, but the different data that AI uses introduces all sorts of novel and difficult data quality challenges to address. Those working in the data quality realm should have job security for some time to come!

Originally posted in the Analytics Matters newsletter on LinkedIn

The post AI’s Achilles’ Heel: The Data Quality Dilemma appeared first on Datafloq.
