Creating simple genealogy timelines with AI works well, but family history research often requires more complex approaches. You may need timelines that span multiple generations, combine different family groups, or analyse extensive descendancy data.
My experiments with complex genealogy timelines revealed significant differences between AI tools. Some handle complexity brilliantly, while others fail completely. This post shares my findings and shows you how to create useful complex timelines for research planning and problem-solving.
Quick Recap: Where We Left Off
In my previous post, I showed you how AI can create simple timelines for individuals and their immediate families. The key findings were:
- Format matters: Structured reports from family history software work better than web-based PDFs as input data
- Instructions are crucial: Clear prompts minimise common interpretation errors
- Quality varies by tool: Different AI tools have different strengths and weaknesses.
For simple timelines, all three tested tools (ChatGPT 4o, Google Gemini 2.0, and Claude Sonnet) performed well with properly formatted data. But complexity changes everything.
Why Complex Timelines Matter
Complex genealogy timelines serve different research purposes than simple ones:
- Multi-generational analysis: Track family patterns across time and place
- Problem-solving: Identify inconsistencies, conflicts and gaps in your research or the data
- Research planning: Visualise what you know versus what you need to find
- Family group studies: Understand relationships between families.
The timelines available in online family trees and family history software are limited. You get basic birth-death information, usually in PDF format only. For serious research, you need customisable timelines with complete event data in editable formats.
My Systematic Testing Approach
I needed to understand where each AI tool reaches its limits. My goal was to find a process that generates accurate timelines efficiently, without extensive error correction.
I tested increasingly complex document formats to identify the breaking point where AI makes too many mistakes. When a tool struggled, I tried alternative approaches to see if different methods resolved the problems.
AI Tools Tested:
- ChatGPT 4o (paid)
- Google Gemini 2.0 (free)
- Claude Sonnet (paid)
Document Formats Tested:
- Family tree chart: Legacy Family Tree PDF, five generations, 28 people
- Medium complexity report: Legacy descendancy report PDF, 11 pages, 3 generations, 47 people, 159 events
- High complexity report: Legacy descendancy report PDF, 15 pages, 3 generations, 238 people, 249 events
I used the same prompt from my simple timeline experiments, with modifications based on those findings. I added a unique identifier column to distinguish between people of the same name. And I provided specific guidance for interpreting residence events, as errors were common for this event type.
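To illustrate what a unique identifier column does, here is a minimal Python sketch. It is not the prompt or the identifier scheme used in these experiments; the field names (`name`, `birth_year`), the `P001`-style IDs and the sample people are all hypothetical.

```python
from collections import Counter


def assign_person_ids(people):
    """Give each person a sequential ID and flag names that occur more than once."""
    name_counts = Counter(person["name"] for person in people)
    labelled = []
    for index, person in enumerate(people, start=1):
        labelled.append({
            "id": f"P{index:03d}",  # hypothetical ID format
            "name": person["name"],
            "birth_year": person["birth_year"],
            "name_is_shared": name_counts[person["name"]] > 1,
        })
    return labelled


# Invented sample data: two different people share the name John Smith.
people = [
    {"name": "John Smith", "birth_year": 1820},
    {"name": "Mary Jones", "birth_year": 1825},
    {"name": "John Smith", "birth_year": 1850},
]

labelled = assign_person_ids(people)
for row in labelled:
    print(row["id"], row["name"], row["birth_year"], row["name_is_shared"])
```

With an ID on every row, events for the two John Smiths stay separate even when a timeline is sorted or filtered by name.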
Results: The Complexity Breaking Point
| Document Type | ChatGPT 4o | Google Gemini | Claude Sonnet |
| --- | --- | --- | --- |
| Family tree chart PDF | Failed | Failed | Failed |
| Medium complexity report | Variable results | 100% success | 100% success |
| High complexity report | Failed | 100% success | 90% success |
Family Tree Charts: Universal Failure
Family tree charts in PDF format proved impossible for all three tools. ChatGPT extracted no usable data. Gemini and Claude extracted some correct information but omitted people, missed events, and included incorrect data.
Even when I provided examples of correct interpretations, Claude couldn’t apply them consistently across the chart. The visual layout and graphical elements appear to make these documents unsuitable for AI timeline creation. I plan to run more tests, though, because reliable chart extraction would be particularly useful.

Medium Complexity: Clear Winners Emerge
This is where tool differences became apparent:
ChatGPT produced variable results. It managed one 2-generation, 9-page report successfully but omitted residence events. Other similar reports contained numerous errors, making results unreliable.
Gemini and Claude both produced high-quality timelines consistently. Gemini included some unrequested event types, but the core timeline data was accurate.
High Complexity: The Real Test
ChatGPT struggled even when I broke complex reports into smaller parts, feeding it one generation at a time. The error rate remained unacceptably high for research purposes.
Gemini maintained a 100% success rate even with the most complex documents, though it sometimes included event types not requested in the prompt.
Claude achieved a 90% success rate. It occasionally missed events but responded well to correction prompts, fixing errors when they were pointed out.

Common Problems and Solutions
Understand Error Patterns
Errors often follow discernible patterns. In my experiments these included:
- All events before a certain year omitted
- All marriages excluded
- Data appearing in wrong columns when locations weren’t specified
- Events omitted when separated from names by page breaks.
Identifying these patterns helps you prompt AI for specific corrections and may help you improve the data input for future situations.
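Two of the patterns above (a whole event type missing, and nothing before a certain year) can be spotted with a quick summary of the output. The following Python sketch is only an illustration under stated assumptions: the row fields (`year`, `event_type`), the sample rows and the expected values are hypothetical, and you would take the expected values from your own source report.

```python
from collections import Counter


def summarise_timeline(rows):
    """Return event counts per type and the earliest year present."""
    counts = Counter(row["event_type"] for row in rows)
    years = [row["year"] for row in rows if row["year"] is not None]
    earliest = min(years) if years else None
    return counts, earliest


def find_pattern_warnings(rows, expected_types, expected_earliest_year):
    """Flag missing event types and a suspiciously late first event."""
    counts, earliest = summarise_timeline(rows)
    warnings = []
    for event_type in expected_types:
        if counts[event_type] == 0:
            warnings.append(f"No '{event_type}' events found")
    if earliest is None:
        warnings.append("No dated events found")
    elif earliest > expected_earliest_year:
        warnings.append(
            f"Earliest event is {earliest}, but the source starts in {expected_earliest_year}"
        )
    return warnings


# Invented AI output: marriages are missing and nothing is dated before 1850.
rows = [
    {"year": 1850, "event_type": "birth"},
    {"year": 1861, "event_type": "residence"},
    {"year": 1900, "event_type": "death"},
]

warnings = find_pattern_warnings(
    rows,
    expected_types=["birth", "marriage", "death"],
    expected_earliest_year=1820,
)
for warning in warnings:
    print(warning)
```

Each warning maps directly to a correction prompt, for example asking the AI to add the marriage events it left out.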
Input Data Issues
Sometimes the problem isn’t AI interpretation but your source data. I discovered Legacy Family Tree had started omitting marriage dates from reports, which explained why both Claude and Gemini excluded these events. Always verify your input data quality first.
Quality Control Strategy
Check AI output systematically:
- Download the timeline into Excel
- Apply filters by person’s name
- Review events for each individual separately
- Look for obvious gaps or inconsistencies.
If quality control takes excessive time, the AI tool isn’t worth using. High error rates defeat the time-saving purpose.
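If you prefer a script to spreadsheet filters, the same per-person review can be sketched in Python. This is a rough illustration, not a tested workflow: the column names (`person_id`, `name`, `year`, `event_type`) and the sample data are assumptions, and the checks only flag rows for review. Some events, such as burials or probate, legitimately fall after a death.

```python
import csv
import io
from collections import defaultdict

# Invented sample of an AI-generated timeline exported as CSV.
SAMPLE_CSV = """person_id,name,year,event_type
P001,John Smith,1820,Birth
P001,John Smith,1845,Marriage
P001,John Smith,1890,Death
P002,Mary Jones,1851,Residence
P002,Mary Jones,1825,Birth
P002,Mary Jones,1899,Residence
P002,Mary Jones,1895,Death
"""


def group_by_person(csv_text):
    """Group (year, event_type) pairs by person ID, sorted by year."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        groups[row["person_id"]].append((int(row["year"]), row["event_type"]))
    for events in groups.values():
        events.sort()
    return dict(groups)


def check_person(events):
    """Return a list of items to review for one person's events."""
    issues = []
    births = [year for year, kind in events if kind == "Birth"]
    deaths = [year for year, kind in events if kind == "Death"]
    if not births:
        issues.append("no birth event")
    if not deaths:
        issues.append("no death event")
    for year, kind in events:
        if births and year < births[0]:
            issues.append(f"{kind} in {year} is before birth")
        if deaths and year > deaths[0]:
            issues.append(f"{kind} in {year} is after death")
    return issues


timeline = group_by_person(SAMPLE_CSV)
report = {person_id: check_person(events) for person_id, events in timeline.items()}
for person_id, issues in report.items():
    print(person_id, issues or "nothing flagged")
```

In this sample, only the 1899 residence for P002 is flagged, because it falls after the recorded death in 1895.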
Optimising Your Approach
Adjust Your Prompting
For complex timelines, I made these prompt modifications:
- Added unique identifier column requirements
- Provided specific residence event interpretation guidance
- Included examples of correct date handling for approximate dates.
When asking for error corrections, describe the problem and provide one or two specific examples. If the AI struggles, ask it to explain the difficulty and discuss alternative approaches.
Find the Complexity Sweet Spot
More data doesn’t always mean better results. Increasing complexity can increase error rates and processing time, negating time savings.
My testing for these data formats suggests the complexity limit lies around 160-250 events. Beyond this, error rates increase significantly. Gemini handles higher complexity better than the other tools, but I wouldn’t exceed 250 events in a single request.
Optimisation strategies:
- Exclude unnecessary data (such as indexes)
- Break very large datasets into logical chunks
- Focus on event types that are essential for your research objectives.
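One way to break a large dataset into logical chunks is to group consecutive generations until a batch would pass the 250-event ceiling suggested above. The Python sketch below assumes you already know the event count for each generation; the counts shown are invented, and the function name is hypothetical.

```python
def chunk_by_generation(events_per_generation, max_events=250):
    """Pack consecutive generations into batches of at most max_events events.

    A single generation larger than max_events still gets its own batch,
    so it would need to be split further by hand.
    """
    batches, current, current_total = [], [], 0
    for generation, count in events_per_generation:
        # Start a new batch if adding this generation would pass the limit.
        if current and current_total + count > max_events:
            batches.append(current)
            current, current_total = [], 0
        current.append(generation)
        current_total += count
    if current:
        batches.append(current)
    return batches


# Invented (generation number, event count) pairs.
counts = [(1, 40), (2, 119), (3, 180), (4, 90)]
batches = chunk_by_generation(counts)
print(batches)
```

Each batch can then be exported as its own report and submitted in a separate request.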
Format Considerations
- Text-based PDFs work best: Structured reports from family history software
- Avoid image-based documents: Family tree charts, scanned documents
- Consistent formatting helps: Improve data entry in your family tree.
Practical Recommendations
For medium complexity projects (under 160 events): Gemini or Claude both work well. Choose based on your preferences.
For high complexity projects (160-250 events): Gemini shows superior performance, but Claude works well with careful quality control.
Avoid ChatGPT for complex genealogy timelines. Its inconsistent performance makes it unsuitable for research purposes.
Always prepare fallback approaches: If your preferred tool struggles with specific data, try alternative formatting or different tools.
What This Means for Your Research
AI can definitely accelerate complex timeline creation, but success requires:
- Realistic expectations: Understand your chosen tool’s limitations
- Proper preparation: Use structured, text-based input data
- Strategic complexity management: Stay within the 160-250 event range (or 11-15 A4 pages)
- Systematic quality control: Check results methodically.
Complex AI-generated timelines can support your family history research, but they’re not magic solutions. They require the same critical thinking and verification you’d apply to any research tool.
Looking Ahead
My experiments continue, focusing on:
- Including source citations in AI-generated timelines
- Using timelines for specific genealogical problem-solving scenarios
- Optimising prompts for different research purposes.
In another post, I’ll explore real-world case studies using AI timelines to solve specific genealogical problems.
The key takeaway: AI can handle complex genealogy timelines effectively, but tool selection and proper preparation determine success.
If you’re not already using family history software, Legacy Family Tree is free.
If you want to learn more about using timelines in genealogy, start with this post: Use Genealogy Timelines to Organise, Analyse and Improve Your Research.
About the Author
Danielle Lautrec is a genealogy educator, researcher, and author of The Good Genealogist. With qualifications in history, family history, and historical archaeology, she teaches for the Society of Australian Genealogists.

