Analysis Tool
The project includes a Python-based analysis tool that verifies claims from Chomsky’s essay against documents from Harvard’s Nuremberg Trials Project.
Installation
# Install dependencies
pip install requests beautifulsoup4 lxml
Usage
Basic Usage
# Run full analysis
python3 -m nuremberg_analysis.main
# Generate reports in specific directory
python3 -m nuremberg_analysis.main --output-dir ./my_reports
# Use cached documents only (skip web scraping)
python3 -m nuremberg_analysis.main --skip-scraping
# Focus on specific claims
python3 -m nuremberg_analysis.main --focus-claims pal_001 pal_002 yamashita_001
Command Line Options
- --output-dir: Directory for output reports (default: ./reports)
- --cache-dir: Directory for cached documents (default: ./nuremberg_cache)
- --format: Output format: markdown, html, json, or all (default: all)
- --rate-limit: Seconds to wait between HTTP requests (default: 1.0)
- --skip-scraping: Skip web scraping, use cached documents only
- --focus-claims: Only verify specific claim IDs
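As a rough illustration, the options listed above could be declared with argparse as follows. This is a minimal sketch, not the actual contents of main.py: the function name build_parser is hypothetical, and only the flag names and defaults are taken from this README.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="nuremberg_analysis.main")
    parser.add_argument("--output-dir", default="./reports",
                        help="Directory for output reports")
    parser.add_argument("--cache-dir", default="./nuremberg_cache",
                        help="Directory for cached documents")
    parser.add_argument("--format", default="all",
                        choices=["markdown", "html", "json", "all"],
                        help="Output format")
    parser.add_argument("--rate-limit", type=float, default=1.0,
                        help="Seconds to wait between HTTP requests")
    parser.add_argument("--skip-scraping", action="store_true",
                        help="Skip web scraping, use cached documents only")
    parser.add_argument("--focus-claims", nargs="+", default=None,
                        metavar="CLAIM_ID",
                        help="Only verify specific claim IDs")
    return parser


# Parse an explicit argument list instead of sys.argv, for demonstration.
args = build_parser().parse_args(["--skip-scraping", "--focus-claims", "pal_001"])
print(args.skip_scraping, args.focus_claims)
```

With nargs="+", --focus-claims accepts one or more IDs separated by spaces, which matches the usage shown under Basic Usage.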
Tool Structure
nuremberg_analysis/
├── __init__.py # Package initialization
├── chomsky_claims.py # Claim extraction and structuring
├── document_scraper.py # Web scraper for Harvard website
├── claim_verifier.py # Claim verification logic
├── report_generator.py # Report generation
├── main.py # Main analysis script
└── test_basic.py # Basic test suite
Claims Analyzed
The tool analyzes 20 claims across 8 categories:
- Tokyo Trials - General Yamashita (1 claim)
- Tokyo Trials - Justice Pal Dissent (5 claims)
- Nuremberg Principles - Operational Criterion (4 claims)
- Nuremberg Trials - Admiral Doenitz Case (1 claim)
- Nuremberg Principles - Telford Taylor (4 claims)
- Nuremberg Principles - Ex Post Facto (1 claim)
- Tokyo Trials - General Assessment (3 claims)
- American Presidents - Context (1 claim)
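To make the relationship between claim IDs, categories, and the --focus-claims option concrete, here is a minimal sketch of how claims might be represented and filtered. The Claim class, its field names, and both helper functions are assumptions for illustration, and chomsky_claims.py may structure this differently. The claim text is left as a placeholder.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Claim:
    """Hypothetical claim record; the real structure may differ."""
    claim_id: str  # e.g. "pal_001"
    category: str  # e.g. "Tokyo Trials - Justice Pal Dissent"
    text: str      # the claim as stated in the essay


def select_claims(claims: Iterable[Claim],
                  focus_ids: Optional[List[str]] = None) -> List[Claim]:
    """Return all claims, or only those whose IDs appear in focus_ids."""
    claims = list(claims)
    if not focus_ids:
        return claims
    known = {c.claim_id for c in claims}
    unknown = [cid for cid in focus_ids if cid not in known]
    if unknown:
        # Fail loudly on a mistyped ID instead of silently verifying nothing.
        raise ValueError(f"Unknown claim IDs: {', '.join(unknown)}")
    wanted = set(focus_ids)
    return [c for c in claims if c.claim_id in wanted]


def count_by_category(claims: Iterable[Claim]) -> Counter:
    """Count claims per category, as in the list above."""
    return Counter(c.category for c in claims)


sample = [
    Claim("yamashita_001", "Tokyo Trials - General Yamashita", "(claim text)"),
    Claim("pal_001", "Tokyo Trials - Justice Pal Dissent", "(claim text)"),
    Claim("pal_002", "Tokyo Trials - Justice Pal Dissent", "(claim text)"),
]
print([c.claim_id for c in select_claims(sample, ["pal_001", "pal_002"])])
```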
Report Format
Reports include:
- Executive Summary: Overview of verification status
- Detailed Results: Claim-by-claim verification with:
  - Claim text and context
  - Verification status (verified/partially verified/contradicted/insufficient evidence)
  - Confidence level
  - Supporting evidence excerpts
  - Contradicting evidence excerpts
  - Citations and references
- Methodology: Description of analysis approach
- Citations: Complete list of document sources
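The per-claim fields listed above map naturally onto a small record type. The sketch below is one possible shape for a JSON report entry, not the tool's actual schema: the names Status, ClaimResult, to_json, and summarize are hypothetical, and treating the confidence level as a free-form string is an assumption.

```python
import json
from collections import Counter
from dataclasses import asdict, dataclass, field
from enum import Enum
from typing import Dict, List


class Status(Enum):
    """The four verification statuses named in this README."""
    VERIFIED = "verified"
    PARTIALLY_VERIFIED = "partially verified"
    CONTRADICTED = "contradicted"
    INSUFFICIENT_EVIDENCE = "insufficient evidence"


@dataclass
class ClaimResult:
    """Hypothetical per-claim result; field names are illustrative."""
    claim_id: str
    claim_text: str
    status: Status
    confidence: str  # assumed to be a label such as "low"; the real type is unknown
    supporting_evidence: List[str] = field(default_factory=list)
    contradicting_evidence: List[str] = field(default_factory=list)
    citations: List[str] = field(default_factory=list)


def to_json(results: List[ClaimResult]) -> str:
    """Serialize results, storing each status as its plain string value."""
    rows = []
    for result in results:
        row = asdict(result)
        row["status"] = result.status.value
        rows.append(row)
    return json.dumps(rows, indent=2)


def summarize(results: List[ClaimResult]) -> Dict[str, int]:
    """Count results per status, e.g. for an executive summary."""
    return dict(Counter(r.status.value for r in results))


example = ClaimResult("pal_001", "(claim text)", Status.INSUFFICIENT_EVIDENCE, "low")
print(to_json([example]))
```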
Example Output
After running the analysis, you’ll get:
reports/
├── chomsky_verification_report_YYYYMMDD_HHMMSS.md
├── chomsky_verification_report_YYYYMMDD_HHMMSS.html
└── chomsky_verification_report_YYYYMMDD_HHMMSS.json
nuremberg_cache/
└── [cached HTML documents]
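The YYYYMMDD_HHMMSS part of each file name is a timestamp taken when the report is generated. The following sketch shows one way such paths could be built. The function report_paths and the EXTENSIONS mapping are hypothetical, and only the file name pattern and the format names come from this README.

```python
from datetime import datetime
from pathlib import Path
from typing import Dict, Optional

# Map each documented --format value to the file extension shown above.
EXTENSIONS = {"markdown": "md", "html": "html", "json": "json"}


def report_paths(output_dir: str = "./reports", fmt: str = "all",
                 now: Optional[datetime] = None) -> Dict[str, Path]:
    """Return the report path for each requested format (illustrative only)."""
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    formats = list(EXTENSIONS) if fmt == "all" else [fmt]
    return {
        name: Path(output_dir) / f"chomsky_verification_report_{stamp}.{EXTENSIONS[name]}"
        for name in formats
    }


for path in report_paths("./reports", "all").values():
    print(path)
```

Passing now explicitly keeps the three files of one run on the same timestamp and makes the function easy to test.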
Features
- Comprehensive: Analyzes all 20 claims from Chomsky’s essay
- Automated: Searches documents and verifies claims without manual steps
- Cached: Documents are cached for offline analysis
- Multiple Formats: Reports in Markdown, HTML, and JSON
- Respectful: Requests are rate-limited (1.0 second apart by default) to avoid overloading Harvard’s servers
- Well-Documented: Comprehensive documentation and examples
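The caching and rate-limiting features can be pictured with the sketch below. It is an illustration under stated assumptions, not the code in document_scraper.py: the class CachedFetcher, the function download_with_requests, and the choice to name cache files by a SHA-256 hash of the URL are all hypothetical.

```python
import hashlib
import time
from pathlib import Path
from typing import Callable, Optional


def download_with_requests(url: str) -> str:
    """Default downloader; requests is imported lazily so the sketch loads without it."""
    import requests
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return response.text


class CachedFetcher:
    """Fetch pages through an on-disk cache, pausing between real requests."""

    def __init__(self, cache_dir: str = "./nuremberg_cache", rate_limit: float = 1.0,
                 download: Optional[Callable[[str], str]] = None,
                 skip_scraping: bool = False,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.cache_dir = Path(cache_dir)
        self.rate_limit = rate_limit
        self.download = download or download_with_requests
        self.skip_scraping = skip_scraping
        self.clock = clock
        self.sleep = sleep
        self._last_request: Optional[float] = None

    def _cache_path(self, url: str) -> Path:
        # Assumption: one file per URL, named by the URL's SHA-256 digest.
        digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
        return self.cache_dir / f"{digest}.html"

    def fetch(self, url: str) -> Optional[str]:
        path = self._cache_path(url)
        if path.exists():
            return path.read_text(encoding="utf-8")  # cache hit: no network, no delay
        if self.skip_scraping:
            return None  # offline mode: uncached documents are unavailable
        if self._last_request is not None:
            wait = self.rate_limit - (self.clock() - self._last_request)
            if wait > 0:
                self.sleep(wait)
        html = self.download(url)
        self._last_request = self.clock()
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        path.write_text(html, encoding="utf-8")
        return html
```

Because cache hits return before the rate-limit check, only real HTTP requests are spaced apart, so re-running the analysis on cached documents incurs no delay.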
Limitations
- Website Structure: The scraper depends on the structure of the Harvard website; changes to the site may require updating the scraper
- Document Access: Some documents may require special access or may not be fully digitized
- External Sources: Some claims (e.g., about Telford Taylor’s book) may require consulting external sources
- Automated Analysis: This is an automated tool. Full verification may require manual review of original documents
Ethical Considerations
This tool analyzes historical documents related to war crimes trials. Users should:
- Be aware that documents contain graphic descriptions of violence and genocide
- Use the tool responsibly and verify claims through original sources
- Understand that automated analysis has limitations
- Consult academic sources for comprehensive understanding