Analysis Tool

The project includes a Python analysis tool that checks claims against documents from Harvard’s Nuremberg Trials Project.

Installation

# Install dependencies
pip install requests beautifulsoup4 lxml

Usage

Basic Usage

# Run full analysis
python3 -m nuremberg_analysis.main

# Generate reports in specific directory
python3 -m nuremberg_analysis.main --output-dir ./my_reports

# Use cached documents only (skip web scraping)
python3 -m nuremberg_analysis.main --skip-scraping

# Focus on specific claims
python3 -m nuremberg_analysis.main --focus-claims pal_001 pal_002 yamashita_001

Command Line Options

--output-dir: Directory for output reports (default: ./reports)
--cache-dir: Directory for cached documents (default: ./nuremberg_cache)
--format: Output format - markdown, html, json, or all (default: all)
--rate-limit: Seconds to wait between HTTP requests (default: 1.0)
--skip-scraping: Skip web scraping, use cached documents only
--focus-claims: Only verify specific claim IDs
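
For readers who want to see how these options fit together, here is a minimal sketch of an argparse parser that mirrors the flags and defaults listed above. The function name `build_parser` and the help strings are assumptions for illustration; the actual code in main.py may be organized differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented command line options."""
    parser = argparse.ArgumentParser(prog="nuremberg_analysis.main")
    parser.add_argument("--output-dir", default="./reports",
                        help="Directory for output reports")
    parser.add_argument("--cache-dir", default="./nuremberg_cache",
                        help="Directory for cached documents")
    parser.add_argument("--format", default="all",
                        choices=["markdown", "html", "json", "all"],
                        help="Output format")
    parser.add_argument("--rate-limit", type=float, default=1.0,
                        help="Seconds to wait between HTTP requests")
    parser.add_argument("--skip-scraping", action="store_true",
                        help="Skip web scraping, use cached documents only")
    # nargs="+" accepts one or more claim IDs, e.g. pal_001 pal_002
    parser.add_argument("--focus-claims", nargs="+", default=None,
                        metavar="CLAIM_ID",
                        help="Only verify specific claim IDs")
    return parser
```

Calling `build_parser().parse_args(["--focus-claims", "pal_001", "pal_002"])` yields a namespace whose `focus_claims` attribute is a list of the given IDs, with every other option left at its documented default.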

Tool Structure

nuremberg_analysis/
├── __init__.py              # Package initialization
├── chomsky_claims.py        # Claim extraction and structuring
├── document_scraper.py      # Web scraper for Harvard website
├── claim_verifier.py        # Claim verification logic
├── report_generator.py      # Report generation
├── main.py                  # Main analysis script
└── test_basic.py            # Basic test suite

Claims Analyzed

The tool analyzes 20 claims across 8 categories:

Tokyo Trials - General Yamashita (1 claim)
Tokyo Trials - Justice Pal Dissent (5 claims)
Nuremberg Principles - Operational Criterion (4 claims)
Nuremberg Trials - Admiral Gernetz Case (1 claim; "Gernetz" is likely a transcription of Dönitz)
Nuremberg Principles - Telford Taylor (4 claims)
Nuremberg Principles - Ex Post Facto (1 claim)
Tokyo Trials - General Assessment (3 claims)
American Presidents - Context (1 claim)

Report Format

Reports include:

Executive Summary: Overview of verification status
Detailed Results: Claim-by-claim verification with:
- Claim text and context
- Verification status (verified/partially verified/contradicted/insufficient evidence)
- Confidence level
- Supporting evidence excerpts
- Contradicting evidence excerpts
- Citations and references
Methodology: Description of analysis approach
Citations: Complete list of document sources
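
To make the report structure concrete, the following sketch models one claim result and derives an executive summary and a JSON payload from a list of results. The class and field names, and the choice of a 0.0 to 1.0 float for the confidence level, are assumptions; the actual JSON schema written by report_generator.py may differ.

```python
import json
from collections import Counter
from dataclasses import asdict, dataclass, field
from enum import Enum


class Status(str, Enum):
    """The four verification statuses named in the report format."""
    VERIFIED = "verified"
    PARTIALLY_VERIFIED = "partially verified"
    CONTRADICTED = "contradicted"
    INSUFFICIENT_EVIDENCE = "insufficient evidence"


@dataclass
class ClaimResult:
    """Hypothetical per-claim record; field names are assumptions."""
    claim_id: str
    claim_text: str
    status: Status
    confidence: float  # assumed scale: 0.0 (none) to 1.0 (certain)
    supporting_evidence: list = field(default_factory=list)
    contradicting_evidence: list = field(default_factory=list)
    citations: list = field(default_factory=list)


def executive_summary(results):
    """Count results per status, including statuses with zero results."""
    counts = Counter(r.status.value for r in results)
    return {s.value: counts.get(s.value, 0) for s in Status}


def to_json(results):
    """Serialize the summary and detailed results as a JSON string."""
    payload = {
        "summary": executive_summary(results),
        "results": [asdict(r) for r in results],
    }
    return json.dumps(payload, indent=2)
```

Because `Status` also subclasses `str`, each status serializes as its plain string value without a custom JSON encoder.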

Example Output

After running the analysis, you’ll get:

reports/
├── chomsky_verification_report_YYYYMMDD_HHMMSS.md
├── chomsky_verification_report_YYYYMMDD_HHMMSS.html
└── chomsky_verification_report_YYYYMMDD_HHMMSS.json

nuremberg_cache/
└── [cached HTML documents]
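
The timestamped file names above follow a YYYYMMDD_HHMMSS pattern. As a sketch of how such names can be built, the hypothetical helper below maps the --format value to file extensions and stamps each name with the current time. The function and constant names are assumptions, not the tool's actual code.

```python
from datetime import datetime
from pathlib import Path

# Maps each --format value to the file extension shown in the listing above.
EXTENSIONS = {"markdown": "md", "html": "html", "json": "json"}


def report_paths(output_dir, fmt="all", now=None):
    """Return the report paths for the chosen format ("all" gives one per format)."""
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    formats = list(EXTENSIONS) if fmt == "all" else [fmt]
    return [
        Path(output_dir) / f"chomsky_verification_report_{stamp}.{EXTENSIONS[f]}"
        for f in formats
    ]
```

Passing an explicit `now` value keeps the helper deterministic, which is convenient in tests.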

Features

Comprehensive: Analyzes all 20 claims from Chomsky’s essay
Automated: Searches documents and verifies claims without manual steps
Cached: Documents are cached for offline analysis
Multiple Formats: Reports in Markdown, HTML, and JSON
Respectful: Rate-limits requests to avoid overloading Harvard’s servers
Well-Documented: Comprehensive documentation and examples
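
The caching and rate-limiting features can be sketched together. The class below is an illustration under stated assumptions, not the implementation in document_scraper.py: `CachedFetcher`, its parameters, and the hash-based cache file naming are all hypothetical. The network call is passed in as a function so the sketch runs without any third-party package.

```python
import hashlib
import time
from pathlib import Path


class CachedFetcher:
    """Hypothetical fetcher: serve from disk cache, otherwise fetch politely."""

    def __init__(self, fetch, cache_dir="./nuremberg_cache",
                 rate_limit=1.0, skip_scraping=False):
        self.fetch = fetch                  # callable: url -> HTML text
        self.cache_dir = Path(cache_dir)
        self.rate_limit = rate_limit        # minimum seconds between requests
        self.skip_scraping = skip_scraping  # True: never touch the network
        self._last_request = None

    def _cache_path(self, url):
        # Assumed naming scheme: SHA-256 of the URL as the file name.
        digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
        return self.cache_dir / f"{digest}.html"

    def get(self, url):
        path = self._cache_path(url)
        if path.exists():
            return path.read_text(encoding="utf-8")
        if self.skip_scraping:
            return None  # not cached and scraping is disabled
        if self._last_request is not None:
            wait = self.rate_limit - (time.monotonic() - self._last_request)
            if wait > 0:
                time.sleep(wait)
        html = self.fetch(url)
        self._last_request = time.monotonic()
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        path.write_text(html, encoding="utf-8")
        return html
```

With the requests package installed, a fetch function such as `lambda url: requests.get(url, timeout=30).text` could be passed as the first argument. Cache hits return immediately, so the rate limit only applies to real network requests.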

Limitations

Website Structure: The scraper depends on the current structure of the Harvard website; if the site changes, the scraper may need updating
Document Access: Some documents may require special access or may not be fully digitized
External Sources: Some claims (e.g., about Telford Taylor’s book) may require consulting external sources
Automated Analysis: This is an automated tool. Full verification may require manual review of original documents

Ethical Considerations

This tool analyzes historical documents related to war crimes trials. Users should:

Be aware that documents contain graphic descriptions of violence and genocide
Use the tool responsibly and verify claims through original sources
Understand that automated analysis has limitations
Consult academic sources for comprehensive understanding