Complex Genealogy Timelines with AI

Creating simple genealogy timelines with AI works well, but family history research often requires more complex approaches. You need timelines spanning multiple generations, combining different family groups, or analysing extensive descendancy data.

My experiments with complex genealogy timelines revealed significant differences between AI tools. Some handle complexity brilliantly, while others fail completely. This post shares my findings and shows you how to create useful complex timelines for research planning and problem-solving.

Quick Recap: Where We Left Off

In my previous post, I showed you how AI can create simple timelines for individuals and their immediate families. The key findings were:

Format matters: Structured reports from family history software work better than web-based PDFs as input data
Instructions are crucial: Clear prompts minimise common interpretation errors
Quality varies by tool: Different AI tools have different strengths and weaknesses.

For simple timelines, all three tested tools (ChatGPT 4o, Google Gemini 2.0, and Claude Sonnet) performed well with properly formatted data. But complexity changes everything.

Why Complex Timelines Matter

Complex genealogy timelines serve different research purposes than simple ones:

Multi-generational analysis: Track family patterns across time and place
Problem-solving: Identify inconsistencies, conflicts and gaps in your research or the data
Research planning: Visualise what you know versus what you need to find
Family group studies: Understand relationships between families.

The timelines available in online family trees and family history software are limited. You get basic birth-death information, usually in PDF format only. For serious research, you need customisable timelines with complete event data in editable formats.

My Systematic Testing Approach

I needed to understand where each AI tool reaches its limits. My goal was to find a process that generates accurate timelines efficiently, without extensive error correction.

I tested increasingly complex document formats to identify the breaking point where AI makes too many mistakes. When a tool struggled, I tried alternative approaches to see if different methods resolved the problems.

AI Tools Tested:

ChatGPT 4o (paid)
Google Gemini 2.0 (free)
Claude Sonnet (paid)

Document Formats Tested:

Family tree chart: Legacy Family Tree PDF, 5 generations, 28 people
Medium complexity report: Legacy descendancy report PDF, 11 pages, 3 generations, 47 people, 159 events
High complexity report: Legacy descendancy report PDF, 15 pages, 3 generations, 238 people, 249 events

I used the same prompt from my simple timeline experiments, with modifications based on those findings. I added a unique identifier column to distinguish between people of the same name. I also provided specific guidance for interpreting residence events, because errors were common for this event type.

Results: The Complexity Breaking Point

Document Type	ChatGPT 4o	Google Gemini	Claude Sonnet
Family tree chart PDF	Failed	Failed	Failed
Medium complexity report	Variable results	100% success	100% success
High complexity report	Failed	100% success	90% success

Family Tree Charts: Universal Failure

Family tree charts in PDF format proved impossible for all three tools. ChatGPT extracted no usable data. Gemini and Claude extracted some correct information but omitted people, missed events, and included incorrect data.

Even when I provided examples of correct interpretations, Claude couldn’t apply them consistently across the chart. The visual layout and graphical elements appear to make these documents unsuitable for AI timeline creation. I will keep testing, though, because reliable extraction from charts would be particularly useful.

Figure: Extract of a family tree chart showing seven individuals across three generations. Family tree charts are difficult for AI to interpret accurately.

Medium Complexity: Clear Winners Emerge

This is where tool differences became apparent:

ChatGPT produced variable results. It handled one 2-generation, 9-page report largely correctly, though it omitted residence events. Other similar reports contained numerous errors, making the results unreliable.

Gemini and Claude both produced high-quality timelines consistently. Gemini included some unrequested event types, but the core timeline data was accurate.

High Complexity: The Real Test

ChatGPT struggled even when I broke complex reports into smaller parts, feeding it one generation at a time. The error rate remained unacceptably high for research purposes.

Gemini maintained a 100% success rate even with the most complex documents, though it sometimes included event types not requested in the prompt.

Claude achieved a 90% success rate. It occasionally missed events but responded well to correction prompts, fixing errors when they were pointed out.

Figure: Extract from a genealogy timeline showing events that were not requested for inclusion. Gemini added event types that were not requested, which is not as bad as omitting events, but not ideal.

Common Problems and Solutions

Understand Error Patterns

Errors often follow discernible patterns. In my experiments these included:

All events before a certain year omitted
All marriages excluded
Data appearing in wrong columns when locations weren’t specified
Events omitted when separated from names by page breaks.

Identifying these patterns helps you prompt AI for specific corrections and may help you improve the data input for future situations.
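
One way to spot patterns like these is to summarise the AI output before reading it line by line. The following is a minimal Python sketch, not the process used in these experiments. It assumes the timeline has been exported as CSV, and the column names (person_id, name, year, event, place) and the sample rows are hypothetical.

```python
import csv
import io
from collections import Counter

# Hypothetical timeline export; the column names and rows are assumptions.
SAMPLE_CSV = """person_id,name,year,event,place
1,John Smith,1850,Birth,Sydney
1,John Smith,1875,Residence,Parramatta
1,John Smith,1910,Death,Sydney
2,Mary Jones,1852,Birth,Bathurst
2,Mary Jones,1912,Death,Sydney
"""


def summarise(csv_text):
    """Count events by type and find the earliest year in the timeline."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    by_type = Counter(row["event"] for row in rows)
    years = [int(row["year"]) for row in rows if row["year"].isdigit()]
    earliest = min(years) if years else None
    return by_type, earliest


def missing_event_types(by_type, expected):
    """Return the expected event types that never appear in the output."""
    return [event for event in expected if by_type[event] == 0]


by_type, earliest = summarise(SAMPLE_CSV)
print(by_type)
print("Earliest year:", earliest)
print("Missing types:", missing_event_types(by_type, ["Birth", "Marriage", "Death"]))
```

An event type with a count of zero, or an earliest year that is later than you know your source data begins, points to the kind of wholesale omission described above.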

Input Data Issues

Sometimes the problem isn’t AI interpretation but your source data. I discovered Legacy Family Tree had started omitting marriage dates from reports, which explained why both Claude and Gemini excluded these events. Always verify your input data quality first.

Quality Control Strategy

Check AI output systematically:

Download the timeline into Excel
Apply filters by person’s name
Review events for each individual separately
Look for obvious gaps or inconsistencies.

If quality control takes excessive time, the AI tool isn’t worth using. High error rates defeat the time-saving purpose.
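
If you prefer a script to Excel filters, the per-person review can be partly automated. This is only a sketch under stated assumptions: the CSV layout, the column names and the event labels ("Birth", "Death") are hypothetical, and the checks flag rows for manual review rather than proving an error.

```python
import csv
import io
from collections import defaultdict

# Hypothetical export; two different people share the name John Smith,
# which is why the unique identifier column matters.
SAMPLE_CSV = """person_id,name,year,event
1,John Smith,1850,Birth
1,John Smith,1910,Death
2,John Smith,1880,Birth
2,John Smith,1875,Marriage
3,Mary Jones,1912,Death
"""


def check_timeline(csv_text):
    """Group events by person and return a list of (person_id, problem) pairs."""
    people = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        people[row["person_id"]].append((int(row["year"]), row["event"]))

    problems = []
    for person_id, events in sorted(people.items()):
        births = [year for year, event in events if event == "Birth"]
        deaths = [year for year, event in events if event == "Death"]
        if not births:
            problems.append((person_id, "no birth event"))
        for year, event in events:
            if births and year < births[0]:
                problems.append((person_id, f"{event} in {year} is before birth"))
            if deaths and year > deaths[0]:
                problems.append((person_id, f"{event} in {year} is after death"))
    return problems


for person_id, problem in check_timeline(SAMPLE_CSV):
    print(person_id, problem)
```

Some flagged rows will be legitimate, such as a burial or probate recorded in the year after a death, so treat the output as a list of places to look, not a list of mistakes.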

Optimising Your Approach

Adjust Your Prompting

For complex timelines, I made these prompt modifications:

Added unique identifier column requirements
Provided specific residence event interpretation guidance
Included examples of correct date handling for approximate dates.
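
Approximate dates also need consistent handling if you later sort or check the timeline in a script. The sketch below is illustrative only and does not reproduce the examples used in the prompt. The qualifier prefixes it recognises (Abt, C, Bef, Aft) are assumptions, so check how your own reports write approximate dates.

```python
import re

# Assumed qualifier prefixes; adjust to match your own reports.
QUALIFIERS = {"abt": "about", "c": "about", "bef": "before", "aft": "after"}


def parse_year(text):
    """Return (year, qualifier) for a date string, or (None, None) if no year is found."""
    match = re.search(r"\b(\d{4})\b", text)
    if not match:
        return None, None
    year = int(match.group(1))
    # The first word decides the qualifier; anything unrecognised counts as exact.
    first_word = text.strip().split()[0].lower().rstrip(".")
    return year, QUALIFIERS.get(first_word, "exact")


for sample in ["Abt 1850", "Bef. 1900", "12 Mar 1871", "unknown"]:
    print(sample, "->", parse_year(sample))
```

Keeping the qualifier in its own column means an "about 1850" birth still sorts with other 1850 events without being mistaken for an exact date.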

When asking for error corrections, describe the problem and provide one or two specific examples. If the AI tool struggles, ask it to explain the difficulty and suggest alternative approaches.

Find the Complexity Sweet Spot

More data doesn’t always mean better results. Increasing complexity can increase error rates and processing time, negating time savings.

My testing for these data formats suggests the complexity limit lies around 160-250 events. Beyond this, error rates increase significantly. Gemini handles higher complexity better than the other tools, but I wouldn’t exceed 250 events in a single request.

Optimisation strategies:

Exclude unnecessary data (such as indexes)
Break very large datasets into logical chunks
Focus on event types that are essential for your research objectives.
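
If your events are already in a structured form, splitting them into chunks can be scripted. The following is a hypothetical sketch, not the method used in these tests: the function name, the (person_id, event) pair format and the default limit of 250 are all assumptions based on the event range discussed above. It keeps each person's events in the same chunk.

```python
def chunk_by_person(events, max_events=250):
    """Split (person_id, event) pairs into chunks that aim for at most max_events.

    Each person's events stay together, so a single person with more events
    than the limit still produces one oversized chunk.
    """
    # Group events by person, preserving first-seen order.
    grouped = {}
    for person_id, event in events:
        grouped.setdefault(person_id, []).append((person_id, event))

    chunks, current = [], []
    for person_events in grouped.values():
        # Start a new chunk if adding this person would exceed the limit.
        if current and len(current) + len(person_events) > max_events:
            chunks.append(current)
            current = []
        current.extend(person_events)
    if current:
        chunks.append(current)
    return chunks


# Hypothetical data: 60 people with 10 events each (600 events in total).
events = [(pid, f"event {i}") for pid in range(1, 61) for i in range(10)]
chunks = chunk_by_person(events)
print([len(chunk) for chunk in chunks])
```

Grouping by person is only one possible boundary. Splitting by generation or family group may suit a descendancy report better, provided each chunk stays under the limit.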

Format Considerations

Text-based PDFs work best: Structured reports from family history software
Avoid image-based documents: Family tree charts, scanned documents
Consistent formatting helps: Enter data consistently in your family tree so that reports are uniform.

Practical Recommendations

For medium complexity projects (under 160 events): Gemini or Claude both work well. Choose based on your preferences.

For high complexity projects (160-250 events): Gemini shows superior performance, but Claude works well with careful quality control.

Avoid ChatGPT for complex genealogy timelines. Its inconsistent performance makes it unsuitable for research purposes.

Always prepare fallback approaches: If your preferred tool struggles with specific data, try alternative formatting or different tools.

What This Means for Your Research

AI can definitely accelerate complex timeline creation, but success requires:

Realistic expectations: Understand your chosen tool’s limitations
Proper preparation: Use structured, text-based input data
Strategic complexity management: Keep each request to no more than about 160-250 events (roughly 11-15 A4 pages)
Systematic quality control: Check results methodically.

Complex AI-generated timelines can support your family history research, but they’re not magic solutions. They require the same critical thinking and verification you’d apply to any research tool.

Looking Ahead

My experiments continue, focusing on:

Including source citations in AI-generated timelines
Using timelines for specific genealogical problem-solving scenarios
Optimising prompts for different research purposes.

In another post, I’ll explore real-world case studies using AI timelines to solve specific genealogical problems.

The key takeaway: AI can handle complex genealogy timelines effectively, but tool selection and proper preparation determine success.

If you’re not already using family history software, Legacy Family Tree is free.

If you want to learn more about using timelines in genealogy, start with this post: Use Genealogy Timelines to Organise, Analyse and Improve Your Research.

About the Author

Danielle Lautrec is a genealogy educator, researcher, and author of The Good Genealogist. With qualifications in history, family history, and historical archaeology, she teaches for the Society of Australian Genealogists.
