From Black Box to Blueprint: Thoughtworks Uses Generative AI to Extract Legacy System Functionality

News Room
Last updated: 2025/09/16 at 4:46 AM
News Room Published 16 September 2025

Thoughtworks consultants recently described an experiment that applied generative AI to a legacy system with no available source code. 

The article, shared on Martin Fowler’s blog, highlighted a pilot where a five-person team analyzed the system’s database, UI, and binaries in parallel.

InfoQ reached out to the authors, Thiyagu Palanisamy and Chandirasekar Thiagarajan, who explained that during the two-week pilot the team used Gemini 2.5 Pro to analyse a thin slice of what was an enormous legacy system. The output of that analysis was a functional specification — a “blueprint” of the black-box system that domain experts were able to validate.

AI proved most effective at decoding assembly code, summarizing binaries, and mapping database changes, and it also eased schema discovery.

As the authors told InfoQ: "AI made a significant difference in reverse engineering the ASM code. Traditional approaches would have taken months to decode the logic specified in ASM and also to identify the system functions vs. business functionality."

The exercise demonstrated how AI can accelerate reverse engineering, providing insights into legacy systems at a pace difficult to achieve through manual methods alone.

Enterprises often rely on critical systems that have become opaque after many years of use. Documentation is incomplete, source code may be missing, and institutional knowledge erodes over time. 

The article frames this as the “black box” problem: the system works, but its internal rules are hidden. The goal is not to regenerate code but to reconstruct a “blueprint” of functional intent that can inform modernization with lower risk.

The pilot combined several techniques. One strand focused on connecting dots across data sources by correlating what could be observed in the UI, database schema, and runtime behavior. Another applied change data capture to trace how specific user actions triggered mutations in the database.

Change Data Capture Methodology (source: martinfowler.com)
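The idea of tracing a UI action to its database mutations can be illustrated with a minimal sketch. It is not the team's tooling: it swaps real change data capture (which typically reads the database's transaction log or uses triggers) for a simpler snapshot-and-diff technique, and the `orders` and `audit_log` tables, the "Approve order" action, and the assumption that the first column is the primary key are all hypothetical.

```python
import sqlite3

def snapshot(conn, tables):
    """Return {table: {primary_key: row_tuple}} for each listed table."""
    state = {}
    for table in tables:
        rows = conn.execute(f"SELECT * FROM {table}").fetchall()
        # Assumption: the first column of every table is its primary key.
        state[table] = {row[0]: row for row in rows}
    return state

def diff_snapshots(before, after):
    """List (table, change_type, key) for every row that changed."""
    changes = []
    for table in before:
        old, new = before[table], after[table]
        for key in new.keys() - old.keys():
            changes.append((table, "insert", key))
        for key in old.keys() - new.keys():
            changes.append((table, "delete", key))
        for key in old.keys() & new.keys():
            if old[key] != new[key]:
                changes.append((table, "update", key))
    return sorted(changes)

# Hypothetical schema standing in for the legacy database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT);
    CREATE TABLE audit_log (id INTEGER PRIMARY KEY, note TEXT);
    INSERT INTO orders VALUES (1, 'open');
""")
tables = ["orders", "audit_log"]

before = snapshot(conn, tables)
# Stand-in for a user clicking "Approve order" in the legacy UI.
conn.execute("UPDATE orders SET status = 'approved' WHERE id = 1")
conn.execute("INSERT INTO audit_log VALUES (1, 'order 1 approved')")
after = snapshot(conn, tables)

changes = diff_snapshots(before, after)
print(changes)
```

Each entry in `changes` links the observed action to a table and row, which is the kind of evidence that can later be matched against binary calls.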

From there, the team attempted server logic inference by linking database activity with binary calls. This extended into what they describe as AI-assisted binary archaeology, where decompilation tools and large language models helped summarize functions and propose candidate responsibilities.

The process was iterative, involving steps such as finding relevant functions, building subtrees, validating entry points, and assembling specifications from fragments into coherent functionality. 
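The "finding relevant functions, building subtrees, validating entry points" steps can be sketched roughly as a walk over a call graph. This is an illustration under stated assumptions rather than the authors' implementation: the call graph is a hand-written dictionary (in practice it would presumably be exported from a decompilation tool), and every function and stored-procedure name in it is hypothetical.

```python
from collections import deque

# Hypothetical call graph: caller -> list of callees.
CALL_GRAPH = {
    "OnApproveClick": ["ValidateOrder", "SaveOrder"],
    "ValidateOrder": ["CheckCreditLimit"],
    "SaveOrder": ["sp_update_order", "WriteAuditLog"],
    "CheckCreditLimit": [],
    "WriteAuditLog": [],
    "sp_update_order": [],
    "UnrelatedReport": ["sp_monthly_totals"],
    "sp_monthly_totals": [],
}

def build_subtree(graph, entry_point):
    """Breadth-first walk collecting every function reachable from entry_point."""
    seen = [entry_point]
    queue = deque([entry_point])
    while queue:
        current = queue.popleft()
        for callee in graph.get(current, []):
            if callee not in seen:
                seen.append(callee)
                queue.append(callee)
    return seen

def find_entry_points(graph, target):
    """Return functions that nobody calls and whose subtree reaches target."""
    called = {callee for callees in graph.values() for callee in callees}
    roots = [fn for fn in graph if fn not in called]
    return [root for root in roots if target in build_subtree(graph, root)]

# Start from a stored procedure seen in the database trace and work upward.
entry_points = find_entry_points(CALL_GRAPH, "sp_update_order")
subtree = build_subtree(CALL_GRAPH, entry_points[0])
print(entry_points, subtree)
```

Keeping each subtree small means the functions handed to a model for summarization stay at the fine-grained level where, according to the authors, AI performed best.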

At each stage, AI provided speed by generating summaries, highlighting relationships, or drafting candidate rules, while humans validated the results across perspectives. 

Inferred Logic Spec (source: martinfowler.com)

When domain experts reviewed the output, they confirmed it captured behavior accurately enough to serve as a reliable reference point. The authors told InfoQ they had high confidence the approach could scale across the broader system, provided continuity of the core team and its accumulated context.

The authors also noted that the same techniques have since been applied to other client engagements, providing significant acceleration in building context with or without access to source code.

The experiment also revealed challenges. While AI accelerated many steps, the models were not always reliable, with risks of hallucinations, false positives, and gaps in coverage. Each hypothesis needed confirmation from other evidence before being accepted.

Validation was critical. Cross-checks between data sources and domain expert reviews ensured that draft specifications were accurate, keeping speed from undermining trust.
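One way to picture this kind of evidence-backed acceptance is a small record that carries its own lineage. The sketch below is an assumption-laden illustration, not the team's process: the `Hypothesis` class, the two-source threshold, and the expert sign-off flag are all hypothetical choices made for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Hypothesis:
    """A candidate business rule drafted by the model, plus its lineage."""
    rule: str
    evidence: dict = field(default_factory=dict)  # source name -> supporting note

    def add_evidence(self, source, note):
        self.evidence[source] = note

    def is_accepted(self, min_sources=2, expert_signed_off=False):
        # Assumed policy: at least two independent sources plus expert review.
        return len(self.evidence) >= min_sources and expert_signed_off

h = Hypothesis("Approving an order writes an audit record")
h.add_evidence("binary", "SaveOrder calls WriteAuditLog")
single_source = h.is_accepted(expert_signed_off=True)

h.add_evidence("database", "audit_log row inserted after approval")
corroborated = h.is_accepted(expert_signed_off=True)
unreviewed = h.is_accepted(expert_signed_off=False)
print(single_source, corroborated, unreviewed)
```

Recording the source of each piece of evidence alongside the rule makes it possible to trace any line of the final specification back to what was actually observed.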

The pilot illustrates both promise and limits for architects considering AI-assisted reverse engineering. The approach showed that AI can help correlate evidence across UI, databases, and binaries, producing draft specifications that domain experts could validate, but each hypothesis still needed corroboration before it could be trusted.

Off the back of the team’s encouraging results, InfoQ spoke with the authors, Thiyagu and Chandirasekar, to learn more about the setup of the pilot, and their reflections on the technique’s potential.

 

InfoQ: How long did the pilot take, and how many people were involved?

About 2 weeks. We had about 5 folks involved in parallel, focusing on extracting context from 3 different areas: DB, application UI, and binaries. We analysed one of the 24 business domains, including 650 tables, 1,200 stored procedures, 350 user screens and 45 compiled DLLs.

InfoQ: Beyond the thin slice pilot, what confidence did the team have that the approach would scale across the full system? 

Pretty high as long as we have the people with context preserved as a core team. We had the knowledge and techniques pinned down, and familiarity with the problem and domain, after the experiment with the thin slice.

InfoQ: Has this method since been applied to other client engagements?

Yes, this has been applied to similar engagements where we need to build context of the legacy systems with or without the source code. This approach has provided us with significant acceleration.

InfoQ: How did the AI-generated specification provide tangible value to the client? 

We walked them through detailed specifications of the thin slice, which gave them confidence to take up this initiative. Using the same approach, we also identified high-level capabilities of the overall system, which helped them build a much deeper understanding of it than before.

InfoQ: Were there specific moments where the AI made a material difference versus traditional reverse engineering? 

AI made a significant difference in reverse engineering the ASM code. Traditional approaches would have taken months to decode the logic specified in ASM and also to identify the system functions vs. business functionality.

InfoQ: What were the most significant pitfalls?

One key pitfall we observed is that AI performs best at a detailed level. When asked to process very large amounts of context, it tends to hallucinate. We also saw instances of positive reinforcement bias, where the model generated overly optimistic or false-positive outputs. Our takeaway is to use AI for fine-grained analysis and build the broader context outside the model, where we can validate and synthesize insights.

InfoQ: How did the team handle validation: what governance or review practices ensured that the AI’s output was trustworthy?

We handled validation by breaking the work into smaller steps and adding detailed lineage at each stage. This allowed us to cross-check and confirm every output before incorporating it into a larger context block. By validating incrementally, we ensured that the overall result remained trustworthy and consistent.

InfoQ: How do you see this approach evolving in the next few years — are there toolchains or practices you’d like to see emerge?

We anticipate a new generation of toolchains that make context ingestion and consolidation almost effortless, with MCP-server-style wrappers seamlessly orchestrating existing reverse engineering tools. Beyond that, we envision AI becoming a native capability within these tools, enabling near-real-time insights as engineers explore complex systems. Perhaps most transformative will be collaborative context building, where multiple stakeholders can co-create, validate, and evolve system blueprints in real time, dramatically reducing the cycle time from discovery to decision making.

InfoQ: What advice would you give to someone in a similar position who wants to try this on their own legacy estate?

Pick a manageable slice, experiment, and let the learnings inspire the next step toward modernizing your legacy estate.

 

 
