Hardware

MachinaCheck: multi-agent CNC manufacturability system built on AMD MI300X

At a glance:

  • MachinaCheck is a five-component multi-agent AI pipeline that ingests STEP files and produces a complete manufacturability report in 25–40 seconds — no manual drawing reading required.
  • The system runs Qwen 2.5 7B entirely on-premise via AMD Instinct MI300X with 192GB HBM3 VRAM, keeping proprietary STEP geometry off third-party API endpoints.
  • Built by Syed Muhammad Sarmad and Sabari Doss R at the AMD Developer Hackathon in May 2026, the tool replaces 30–60 minutes of manual analysis per drawing for shops fielding 10–20 RFQs per week.

The problem with how small CNC shops evaluate jobs today

Walk into any small CNC machine shop and ask the manager how they decide whether to accept a customer job. The answer is almost always the same: they print the drawing, read every dimension by hand, walk around the shop checking which tools are available, estimate whether their machines can hold the required tolerances, and write notes on a clipboard. The whole process takes 30 to 60 minutes per drawing. For a busy shop receiving 10 to 20 RFQs per week, that is 5 to 20 hours of skilled manager time spent on feasibility analysis alone.

Sometimes they get it wrong. They accept a job, start production, and discover halfway through that they don't have the right tap or that their mill cannot hold the tolerance on a critical feature. The part gets scrapped. The customer is unhappy. The machine time is lost. This is not a niche inconvenience — it is a systemic bottleneck that directly eats into the capacity and profitability of small and mid-size machine shops everywhere.

What MachinaCheck does

MachinaCheck is a multi-agent AI system. You upload a STEP file — the standard CAD format that customers send to machine shops — along with three simple inputs: material type, required tolerance, and any thread specifications. Twenty-five to forty seconds later you have a complete manufacturability report telling you exactly whether you can make the part, what tools you need, what is missing, and what actions to take before starting production. No manual drawing reading. No walking around the shop. No guesswork.

The pipeline is a five-component architecture built with LangChain and orchestrated via FastAPI. It starts with a pure Python STEP file parser, moves through four agents (three backed by an LLM, one purely deterministic), and ends with a structured report. Every component has a distinct job: extracting geometry, classifying required operations, matching tools against inventory, reasoning about overall feasibility, and generating the final document.

Why AMD MI300X was a business requirement, not just a technical choice

One point deserves its own section before the architecture: on-premise inference is a business requirement, not just a technical preference. Manufacturing customers sign NDAs. Their STEP files contain proprietary geometry representing years of engineering work and millions of dollars in R&D. The hole pattern on a medical device or the pocket geometry on an aerospace component is confidential intellectual property.

Sending that data to OpenAI, Anthropic, or any commercial API endpoint is a confidentiality violation. Full stop. The AMD Instinct MI300X changes this equation completely. With 192GB of HBM3 VRAM and 5.3 TB/s of memory bandwidth, the team runs Qwen 2.5 7B Instruct entirely on-premise. No data leaves the shop's infrastructure. No STEP geometry is transmitted to a third-party server. The customer's IP stays where it belongs.

This is what "privacy by design" actually means in a manufacturing context — not a checkbox, but a fundamental architectural decision that makes the product viable for real enterprise customers. The alternative — routing proprietary CAD geometry through a cloud API — would disqualify the tool from any shop that deals with defense, medical, or aerospace parts, which is exactly the high-value segment the team wants to serve.

The agent architecture in detail

MachinaCheck uses a five-component pipeline. The first component is a STEP file parser built with cadquery, a Python library on top of OpenCASCADE. It performs mathematically exact feature extraction — no vision model, no OCR, no approximation. The parser returns:

  • All cylindrical holes with diameter and depth
  • Flat surfaces and their areas
  • Chamfers and fillets
  • Bounding box dimensions
  • Total volume and surface area

A Ø6.0mm hole is exactly Ø6.0mm in the output. This extraction is 100% accurate because it reads the mathematical geometry directly.
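The article does not publish the parser's exact output schema, but the shape of the extracted data can be sketched with plain dataclasses (field names here are hypothetical, not the team's actual code):

```python
from dataclasses import dataclass

@dataclass
class Hole:
    diameter_mm: float   # exact value read from the STEP geometry, no OCR rounding
    depth_mm: float

@dataclass
class PartFeatures:
    holes: list            # list of Hole
    bounding_box_mm: tuple # (x, y, z)
    volume_mm3: float
    surface_area_mm2: float

# A Ø6.0 mm hole comes out as exactly 6.0:
part = PartFeatures(
    holes=[Hole(diameter_mm=6.0, depth_mm=12.5)],
    bounding_box_mm=(80.0, 40.0, 20.0),
    volume_mm3=61250.0,
    surface_area_mm2=9800.0,
)
```

Because the numbers come straight from the B-rep math, downstream agents can compare them exactly rather than within an OCR tolerance.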

Agent 1, the operations classifier, runs Qwen 2.5 7B on AMD MI300X via vLLM. It takes the extracted geometry plus user inputs — material, tolerance, threads — and answers: "What CNC operations and tools are required to manufacture this part?" It applies manufacturing domain knowledge: Steel 304 needs carbide tooling. A cylindrical hole needs a drill, not an end mill. A tolerance of ±0.005mm requires a precision machine, not a standard mill.
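In the real system those rules live in the agent's prompt, but their spirit can be sketched as a deterministic lookup (illustrative names and thresholds, not the team's actual prompt or code):

```python
def required_tooling(feature_type: str, material: str, tolerance_mm: float) -> dict:
    """Toy version of the manufacturing domain rules Agent 1 applies via the LLM."""
    # A cylindrical hole needs a drill, not an end mill.
    tool = "drill" if feature_type == "cylindrical_hole" else "end_mill"
    # Hard materials such as stainless steel call for carbide tooling.
    tool_material = "carbide" if material in ("Steel 304", "Stainless 316") else "HSS"
    # Very tight tolerances push the job onto a precision machine.
    machine = "precision_mill" if tolerance_mm <= 0.005 else "standard_mill"
    return {"tool": tool, "tool_material": tool_material, "machine": machine}

required_tooling("cylindrical_hole", "Steel 304", 0.005)
# -> {'tool': 'drill', 'tool_material': 'carbide', 'machine': 'precision_mill'}
```

The LLM earns its place here because real parts mix features, materials, and tolerances in ways a fixed table cannot fully enumerate; the sketch only shows the kind of rule being applied.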

Agent 2, the tool matcher, does not use an LLM at all. It queries the shop's tool inventory database and checks each required tool against what is available. Pure deterministic logic — database lookup, comparison, result. LLMs are not needed for database queries and using them here would add unnecessary latency and hallucination risk.
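A matcher of that shape is a few lines of Python. The sketch below is schematic (the real system queries the shop's inventory database; the tool names and stock counts are made up):

```python
def match_tools(required: list, inventory: dict) -> dict:
    """Check each required tool against on-hand stock. Pure logic, no LLM."""
    available, missing = [], []
    for tool in required:
        (available if inventory.get(tool, 0) > 0 else missing).append(tool)
    return {"available": available, "missing": missing,
            "all_tools_ready": not missing}

inventory = {"drill_6.0mm": 4, "end_mill_10mm": 2}   # hypothetical stock levels
result = match_tools(["drill_6.0mm", "tap_M10x1.5"], inventory)
# result["missing"] == ["tap_M10x1.5"]
```

The output is exact and repeatable, which is precisely why an LLM has no business in this step.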

Agent 3, the feasibility decision agent, runs Qwen 2.5 7B again. It receives the match results and reasons about the overall situation, producing a structured decision such as:

  • "decision": "CONDITIONAL"
  • "confidence": "HIGH"
  • "reason": "All tools available except M10x1.5 tap"
  • "action_items": ["Purchase M10x1.5 tap ($15)"]
  • "risk_flags": ["Verify spindle speed for Steel 304"]
  • "estimated_setup_hours": 2.5

Agent 4, the report generator, also runs Qwen 2.5 7B. It synthesizes everything into a professional manufacturability report with an overall status, executive summary, part analysis, tools status, machine status, and final recommendation.

Running Qwen 2.5 on AMD MI300X

Running Qwen 2.5 7B on AMD Instinct MI300X via ROCm and vLLM was straightforward. The vLLM Quick Start image on AMD Developer Cloud has everything pre-configured. The launch command looks like this:

python -m vllm.entrypoints.openai.api_server \
  --model Qwen/Qwen2.5-7B-Instruct \
  --host 0.0.0.0 \
  --port 8000 \
  --dtype float16 \
  --gpu-memory-utilization 0.5

With gpu-memory-utilization set to 0.5 the system uses approximately 96GB of the available 192GB, leaving plenty of headroom. Inference latency for agent calls averages under 3 seconds. LangChain connects to vLLM via the OpenAI-compatible endpoint:

from langchain_community.llms import VLLMOpenAI

llm = VLLMOpenAI(
    openai_api_base="http://localhost:8000/v1",
    openai_api_key="EMPTY",
    model_name="Qwen/Qwen2.5-7B-Instruct",
    temperature=0.1,
    max_tokens=1000
)

The stack also includes cadquery, FastAPI, and Next.js, with the front end deployed on Hugging Face Spaces.

Results and what the team learned

Testing with real STEP files from GrabCAD produced consistent numbers across the board:

  • Feature extraction: under 1 second for parts with up to 50 features
  • Full pipeline (all 4 agents): 25 to 40 seconds end-to-end
  • Decision accuracy: correct manufacturability assessment on all test parts
  • Privacy: zero bytes of STEP geometry transmitted externally

The key takeaway from the build process is that LLMs should be used only where reasoning is needed. Agent 2 (tool matching) is pure Python, and putting an LLM there would be slower, more expensive, and less reliable. The right tool for a database lookup is a database query.

Prompt engineering for structured output also proved critical. Getting Qwen to reliably output valid JSON required careful rules in the prompt — explicitly stating that cylindrical holes need drills not end mills, that diameters must match exactly, that taps only appear when threads are specified.
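Prompt rules alone do not guarantee clean output, so a common defensive pattern (not necessarily the team's exact code) is to strip any conversational wrapper before parsing:

```python
import json
import re

def extract_json(llm_output: str) -> dict:
    """Pull the first {...} block out of an LLM reply and parse it."""
    match = re.search(r"\{.*\}", llm_output, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model output")
    return json.loads(match.group(0))

reply = 'Sure! Here is the decision:\n{"decision": "CONDITIONAL", "confidence": "HIGH"}'
extract_json(reply)["decision"]   # -> 'CONDITIONAL'
```

Combined with a low temperature (the LangChain config above uses 0.1), this makes the structured handoff between agents reliable in practice.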

The team also noted that AMD MI300X's 192GB VRAM means a much larger model could fit if needed. For a production deployment, Qwen 2.5 72B would fit comfortably and deliver significantly better reasoning quality, which opens a clear upgrade path without any hardware change.
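The headroom claim is easy to sanity-check with back-of-envelope arithmetic: fp16 weights cost 2 bytes per parameter, ignoring the KV cache and activations, which need additional room on top.

```python
def fp16_weight_gb(params_billions: float) -> float:
    """Approximate VRAM needed just for fp16 model weights (2 bytes/param)."""
    return params_billions * 1e9 * 2 / 1e9

fp16_weight_gb(7)    # -> 14.0  GB, a small slice of 192 GB
fp16_weight_gb(72)   # -> 144.0 GB, still under the MI300X's 192 GB
```

The 48 GB left over at 72B is what would have to absorb the KV cache, so serving batch sizes would be tighter than with the 7B model, but the weights alone clear the budget on a single card.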

Try it yourself

The project is live and publicly accessible:

  • HF Space: huggingface.co/spaces/lablab-ai-amd-developer-hackathon/MachinaCheck
  • GitHub: github.com/SyedMuhammadSarmad/Manufacturing-Agent

Upload any STEP file and see the full pipeline in action. The tool was built by Syed Muhammad Sarmad and Sabari Doss R at the AMD Developer Hackathon in May 2026. The complete stack is Qwen 2.5 7B · AMD Instinct MI300X · ROCm · vLLM · LangChain · cadquery · FastAPI · Next.js · Hugging Face Spaces.

Editorial

SiliconFeed is an automated feed: facts are checked against sources; copy is normalized and lightly edited for readers.

FAQ

What does MachinaCheck do with a STEP file?
MachinaCheck ingests a STEP CAD file along with three inputs — material type, required tolerance, and any thread specifications — then runs a five-component pipeline to produce a full manufacturability report. The report covers whether the part can be made, what tools are needed, what is missing from inventory, and what actions to take before production starts.
Why does the system run on AMD MI300X instead of a cloud API?
STEP files contain proprietary geometry from customers who have signed NDAs — medical device hole patterns, aerospace pocket geometry, and similar confidential IP. Sending that data to OpenAI, Anthropic, or any commercial API endpoint would violate those agreements. The AMD Instinct MI300X with 192GB HBM3 VRAM lets the team run Qwen 2.5 7B entirely on-premise via ROCm and vLLM, so zero bytes of STEP geometry leave the shop.
Which models and tools are in the MachinaCheck stack?
The stack includes Qwen 2.5 7B Instruct (via vLLM on AMD MI300X), cadquery for STEP parsing on OpenCASCADE, LangChain for orchestration, FastAPI for the backend, and Next.js for the front end. The project is hosted on Hugging Face Spaces and GitHub under the AMD Developer Hackathon lablab.ai account.
