What is CV?

CV stands for computer vision, the area of AI concerned with helping systems interpret visual information from images, video, scans, documents and camera feeds. A computer vision system might recognise objects, read text from a document, detect damage, compare images, check product quality or flag a safety issue. CV is not the same as OCR, although OCR is one important related capability. In business, computer vision is most useful when visual information already drives a workflow and the organisation wants faster review or more consistent checks.

Reviewed by Jackie, Head of Learning & Development, Levellers · Last reviewed 8 June 2026

What this means

Computer vision is easiest to understand as pattern recognition for visual material. People look at an image and notice what is present, missing, damaged, unusual or important. CV systems try to support parts of that process by extracting useful information from pixels. That may be a label, a bounding box, a confidence score, a text value, a count, a comparison or a prompt for human review.

The term can sound like robotics or autonomous vehicles, but small and mid-sized organisations may meet CV in document capture, ID checks, insurance claims, stock checks, product photography, property inspection, asset management or multimodal AI tools that analyse screenshots and images.

A practical CV project should begin with the business decision. Is the system helping someone read a document, inspect an asset, check a condition, classify a visual case or find exceptions? The answer determines the data, accuracy requirements and review controls.

Why it matters

Computer vision matters because many business processes still depend on people looking. Staff review damaged-goods photos, delivery notes, shelves, forms, IDs, property images, safety footage or marketing assets. This work can be slow and inconsistent, especially when images arrive in different formats and quality levels.

CV can help by turning visual material into structured signals. It can count, classify, compare, crop, flag and route. It can make large image sets searchable. It can help teams find the few cases that need attention rather than manually checking every item. When combined with OCR, NLP or workflow automation, it can move information from a visual source into a business process.

For smaller organisations, the opportunity is usually not to build a bespoke vision model. It is to use CV inside existing tools: document capture, claims platforms, stock systems, quality-control tools, security products or AI assistants. The value depends on whether the tool reduces a real bottleneck and whether people trust the results enough to use them.

The risk is that visual output feels objective. A box around an object or a confidence score can look definitive, but the system may be wrong because of lighting, angle, obstruction, poor training examples or unfamiliar context. CV should support operational judgement rather than quietly replacing it where mistakes matter.

How it works

A computer vision workflow starts with visual input. That input might be a photo, video frame, scanned PDF, screenshot, form, camera feed or image embedded in another system. Quality matters. Low resolution, glare, shadows, blur, compression, poor angle, cluttered backgrounds and inconsistent framing can all reduce performance.

The system then performs one or more tasks. Image classification assigns a label to the whole image, such as "damaged packaging". Object detection finds and locates items inside the image, such as pallets, tools, defects or safety equipment. Segmentation separates parts of an image. OCR extracts text from visual documents or images. Visual comparison checks whether an item resembles a reference image. Multimodal systems combine image understanding with language, so a user can ask questions about a screenshot or photograph.
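The difference between these task types is easiest to see in the shape of their outputs. The Python sketch below is illustrative only: the class names, fields and sample values are assumptions made for this example, not the output format of any particular tool.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Classification:
    """One label for the whole image (hypothetical structure)."""
    label: str
    confidence: float  # 0.0 to 1.0


@dataclass
class Detection:
    """One located item: a label plus a bounding box in pixels."""
    label: str
    confidence: float
    box: tuple[int, int, int, int]  # (left, top, right, bottom)


def count_items(detections: list[Detection], min_confidence: float) -> Counter:
    """Count detected items per label, ignoring low-confidence detections."""
    return Counter(d.label for d in detections if d.confidence >= min_confidence)


# Image classification: a single answer about the whole picture.
whole_image = Classification(label="damaged packaging", confidence=0.87)

# Object detection: several located items in the same picture (made-up values).
found = [
    Detection("pallet", 0.93, (10, 40, 210, 300)),
    Detection("pallet", 0.58, (220, 35, 400, 310)),
    Detection("hard hat", 0.81, (120, 5, 160, 45)),
]
counts = count_items(found, min_confidence=0.6)
```

Here the second pallet falls below the assumed 0.6 cut-off, so it is left out of the count. Whether an item like that should be dropped or sent for review is a workflow decision, not a property of the model.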

Training and evaluation depend on examples. If a system is trained or tuned on labelled images, the labels must be consistent and relevant. If the system is bought as a vendor tool, the organisation still needs to test it on its own image conditions. A model that works on clean product images may fail on dim warehouse photos or customer mobile pictures.

Deployment is the operational part. Teams need to decide what happens at each confidence level, when a human reviews, how errors are reported, how images are stored, and who owns monitoring. The workflow should consider access permissions, retention and security, especially where images include people, premises, documents or personal data.
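The decision about what happens at each confidence level can be written down as a simple routing rule. The sketch below assumes a tool that returns a confidence score between 0 and 1. The threshold values and queue names are hypothetical and would need to be set from testing on the organisation's own images.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    image_id: str
    label: str
    confidence: float  # assumed to be in the range 0.0 to 1.0


# Hypothetical thresholds; real values should come from local testing.
AUTO_ACCEPT_AT = 0.90
HUMAN_REVIEW_AT = 0.60


def route(prediction: Prediction) -> str:
    """Return the name of the queue this prediction should go to."""
    if not 0.0 <= prediction.confidence <= 1.0:
        raise ValueError(f"confidence out of range: {prediction.confidence}")
    if prediction.confidence >= AUTO_ACCEPT_AT:
        return "auto_accept"
    if prediction.confidence >= HUMAN_REVIEW_AT:
        return "human_review"
    # Too uncertain to be useful: ask for a better image or handle manually.
    return "recapture_or_manual"
```

For example, `route(Prediction("img-001", "damaged packaging", 0.72))` returns `"human_review"`. In a high-impact workflow the auto-accept band might be removed entirely so that every case is seen by a person.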

Where it shows up in real workflows

In document processing, CV and OCR can read scanned forms, invoices, delivery notes and application packs. The system may identify the document type, extract visible text and route the file to the right queue. Human review is still needed for low-confidence fields, handwriting, unusual layouts and high-impact decisions.

In insurance or property workflows, CV can help triage photos of damage. A claims team might use it to group roof, vehicle or contents images and flag cases where the photo quality is too poor for assessment. The tool should not be treated as the final assessor where liability, fraud, safety or customer vulnerability is involved.

Common misunderstandings

A common misunderstanding is that computer vision and OCR are the same. OCR focuses on recognising text in images or scanned documents. Computer vision is broader. It can include object detection, image classification, visual comparison, segmentation, motion analysis and multimodal image understanding.

Another misunderstanding is that CV sees the world as people do. It does not. It processes visual patterns based on its design, data and evaluation. It may miss the obvious if the image differs from its training examples. It may also detect a pattern that has no practical meaning.

Leaders may also assume that more images always mean better performance. More data helps only if it is relevant, representative and correctly labelled. Data quality, labelling consistency and test design matter more than volume alone.

Finally, CV is sometimes treated only as a surveillance topic. Surveillance is an important boundary, but many CV uses are not surveillance: document capture, quality checks, product classification and image search are different use cases. Governance should match the actual use, not the acronym.

Risks and boundaries

The biggest CV risks occur when visual interpretation affects people or triggers operational consequences. False positives can create unnecessary investigations, rejected claims, blocked access or unfair treatment. False negatives can miss safety issues, defects, missing stock or fraud indicators. The balance depends on the workflow and the cost of each error.
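One way to make that balance concrete is to put a rough cost on each type of error and compare candidate thresholds on a labelled test set. The sketch below is a minimal illustration. The scores, labels and cost figures are invented, and the function name is an assumption for this example.

```python
def pick_threshold(scored, thresholds, cost_fp, cost_fn):
    """Return (threshold, total_cost) with the lowest total error cost.

    scored: list of (score, is_real_issue) pairs from a labelled test set.
    An image is flagged when its score is at or above the threshold.
    """
    best = None
    for t in thresholds:
        false_positives = sum(1 for s, issue in scored if s >= t and not issue)
        false_negatives = sum(1 for s, issue in scored if s < t and issue)
        cost = false_positives * cost_fp + false_negatives * cost_fn
        if best is None or cost < best[1]:
            best = (t, cost)
    return best


# Invented test results: (model score, whether a reviewer confirmed an issue).
test_set = [
    (0.95, True), (0.80, True), (0.70, False),
    (0.55, True), (0.40, False), (0.20, False),
]
candidates = [0.3, 0.5, 0.75, 0.9]

# Missing a safety issue is assumed to cost ten times an unnecessary check.
safety_choice = pick_threshold(test_set, candidates, cost_fp=1, cost_fn=10)

# Wrongly rejecting a claim is assumed to cost ten times a missed flag.
claims_choice = pick_threshold(test_set, candidates, cost_fp=10, cost_fn=1)
```

With these made-up numbers the safety workflow ends up with a lower threshold (0.5) than the claims workflow (0.75). The same model is being used in both cases, and only the cost of each error has changed.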

Images and video may include personal data. Where systems identify or monitor people, the data protection position becomes more sensitive. Biometric recognition is a distinct area and may involve special category biometric data when it is used to uniquely identify a person. Not every image-processing workflow is biometric recognition, but leaders should avoid casual assumptions. Purpose, context and capability matter.

Workplace monitoring needs particular care. A camera system that checks whether protective equipment is present is different from a system that tracks individual workers, scores performance or identifies faces. The more the workflow affects people, the stronger the need for transparency, necessity, proportionality, access controls and review.

There are also security and confidentiality risks. Images may show customer documents, whiteboards, stock rooms, screens, addresses or commercially sensitive assets. Vendor tools should be assessed for data handling, retention, model training use, access and deletion. CV should not become an uncontrolled upload route for sensitive material.

What leaders should do next

Start with a low-risk, visually clear workflow. Good candidates include document type classification, image tagging, asset review, stock-photo organisation, or a quality check where human review remains central. Avoid starting with surveillance, identity, staff monitoring or high-impact eligibility decisions unless there is a strong business need and proper governance.

Write down the visual task. Define the input images, acceptable quality, expected output, confidence thresholds, human review rules and escalation points. Use real examples, including bad images, edge cases and unusual formats.
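The written definition can be kept as a short structured record so that gaps are easy to spot. The sketch below is one possible layout. Every field name and example value is an assumption made for illustration and should be adapted to the actual workflow.

```python
from dataclasses import dataclass, field


@dataclass
class VisualTaskSpec:
    """A written definition of one visual task (all fields are illustrative)."""
    name: str
    input_types: list[str]
    min_width_px: int
    min_height_px: int
    expected_output: str
    auto_accept_threshold: float
    review_threshold: float
    escalation_point: str
    edge_case_examples: list[str] = field(default_factory=list)

    def problems(self) -> list[str]:
        """Return a list of gaps in the definition; empty means none found."""
        issues = []
        if not self.input_types:
            issues.append("no input types listed")
        if not 0.0 <= self.review_threshold <= self.auto_accept_threshold <= 1.0:
            issues.append("thresholds must satisfy 0 <= review <= auto-accept <= 1")
        if not self.escalation_point:
            issues.append("no escalation point named")
        if not self.edge_case_examples:
            issues.append("no bad images or edge cases recorded")
        return issues


spec = VisualTaskSpec(
    name="delivery note type check",
    input_types=["scanned PDF", "mobile photo"],
    min_width_px=1000,
    min_height_px=1400,
    expected_output="document type label with confidence score",
    auto_accept_threshold=0.9,
    review_threshold=0.6,
    escalation_point="operations team lead",
)
gaps = spec.problems()
```

In this example `gaps` contains one entry, because no bad images or edge cases have been recorded yet.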

Test the tool in the conditions where it will actually be used. Do not rely on vendor demos with clean images. Review false positives and false negatives separately. Ask who is affected by errors and whether the workflow gives them a route to correction.
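Reviewing the two error types separately, and by capture condition, needs nothing more than a labelled set of test results. The sketch below assumes each result records the condition the image was taken in, what the tool predicted and what a reviewer found. The condition names and sample data are invented.

```python
from collections import defaultdict


def error_rates_by_condition(results):
    """Compute false positive and false negative rates per capture condition.

    results: iterable of (condition, predicted_issue, actual_issue) tuples.
    A rate is None when a condition has no cases of the relevant kind.
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for condition, predicted, actual in results:
        if predicted and actual:
            counts[condition]["tp"] += 1
        elif predicted and not actual:
            counts[condition]["fp"] += 1
        elif not predicted and actual:
            counts[condition]["fn"] += 1
        else:
            counts[condition]["tn"] += 1

    report = {}
    for condition, c in counts.items():
        actual_negatives = c["fp"] + c["tn"]
        actual_positives = c["tp"] + c["fn"]
        report[condition] = {
            "false_positive_rate": c["fp"] / actual_negatives if actual_negatives else None,
            "false_negative_rate": c["fn"] / actual_positives if actual_positives else None,
        }
    return report


# Invented results: clean studio images versus dim warehouse photos.
results = [
    ("studio", True, True),
    ("studio", False, False),
    ("warehouse", True, True),
    ("warehouse", False, True),   # missed issue
    ("warehouse", True, False),   # false alarm
    ("warehouse", False, False),
]
report = error_rates_by_condition(results)
```

A single overall accuracy figure would hide the pattern here. In this made-up data the tool makes no errors on studio images but gets half of the warehouse cases wrong in each direction.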

Finally, connect CV to the wider information workflow. Extracted text, image labels or flags need somewhere to go: a document management system, case queue, CRM, stock platform or quality log. A vision tool that produces an impressive result but does not change the workflow will not deliver much value.

FAQs

Is OCR part of computer vision?

OCR is closely related to computer vision because it recognises text from visual material such as scans, photographs and PDFs. However, CV is broader than OCR. A computer vision system may classify images, detect objects, compare visual features, identify defects or interpret video. OCR is one important capability within many document and image workflows, but it is not the whole field.

Can computer vision remove manual inspection?

Sometimes it can reduce manual inspection, but leaders should be cautious about full replacement. CV is often strongest as a triage layer that identifies likely issues, low-quality images or cases needing review. Full automation needs much higher accuracy on the organisation's own images, an agreed error tolerance and strong monitoring. In many practical workflows, the best result is fewer routine checks and better human attention on exceptions.

What makes a computer vision project fail?

Many failures come from poor image quality, inconsistent capture conditions, weak labels, unclear review rules or a mismatch between the demo and the real workflow. The system may work in a controlled test but fail with customer photos, warehouse lighting or unusual product variants. CV also fails when outputs are not connected to a business process.

Sources

NIST: AI Measurement and Evaluation Projects - Computer Vision - NIST context for computer vision technologies that extract information from image and video streams, and for AI measurement and evaluation.
NIST: An overview of computer vision - Foundational computer vision context and historical technical framing.
NIST: Multimedia Language Technologies Group - Context for image processing, image understanding, video processing, visual recognition and multimodal media technologies.
Information Commissioner's Office: Key data protection concepts - biometric data guidance - UK guidance on personal information, biometric data and when biometric data becomes special category biometric data.
Information Commissioner's Office: Biometric recognition - Boundary-setting around biometric recognition and unique identification.
NIST: Artificial Intelligence Risk Management Framework (AI RMF 1.0) - Trustworthy AI, risk management, fairness, privacy, safety, evaluation and accountability themes.
