---
license: mit
language:
- en
- ko
tags:
- python
- cli
- markdown
- llm
- developer-tools
- code-analysis
- open-core
---
# Dir2md
[License: MIT](https://opensource.org/licenses/MIT)
[Python](https://www.python.org/downloads/)
> Transform your codebase into LLM-optimized markdown blueprints
Dir2md analyzes directory structures and generates markdown documentation optimized for Large Language Models. It samples file content, removes duplicates, and enforces a token budget so that the output fits within an LLM's context window during AI-assisted development.
## ✨ Key Features
- **🎯 Smart Content Sampling**: Head/tail sampling with configurable token budgets
- **Duplicate Detection**: SimHash-based deduplication to reduce noise
- **🛡️ Security First**: Built-in secret masking (basic in OSS, advanced in Pro)
- **Multiple Output Modes**: Reference, summary, or full inline content
- **🔧 Highly Configurable**: Extensive filtering and customization options
- **⚡ Developer Friendly**: Raw mode by default for complete code visibility
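Head/tail sampling can be pictured with a few lines of Python. This is a minimal sketch of the general technique, not Dir2md's actual implementation: the function name `sample_head_tail` and the omission marker are hypothetical, and the default values 120 and 40 are simply the example values shown for `--sample-head` and `--sample-tail` in the CLI reference.

```python
def sample_head_tail(text: str, head: int = 120, tail: int = 40) -> str:
    """Keep the first `head` and last `tail` lines and mark what was skipped.

    Hypothetical helper for illustration; the marker format is an assumption.
    """
    lines = text.splitlines()
    if len(lines) <= head + tail:
        return text  # short file: nothing needs to be dropped

    omitted = len(lines) - head - tail
    marker = f"... [{omitted} lines omitted] ..."
    # Guard against tail == 0, because lines[-0:] would return every line.
    tail_lines = lines[-tail:] if tail > 0 else []
    return "\n".join(lines[:head] + [marker] + tail_lines)
```

The start of a file usually carries imports and top-level definitions, while the end often carries entry points, which is why both ends are kept and the middle is dropped.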
## Quick Start
### Installation
```bash
# From source (current)
git clone https://github.com/your-org/dir2md.git
cd dir2md
python -m src.dir2md.cli --help
# Coming soon: PyPI installation (not yet published)
# pip install dir2md
```
### Basic Usage
```bash
# Generate project blueprint (developer-friendly raw mode)
dir2md .
# With basic security masking
dir2md . --masking basic
# Generate with manifest for CI/CD
dir2md . --emit-manifest --no-timestamp
# Token-optimized for LLM context
dir2md . --budget-tokens 4000 --preset iceberg
```
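To give a feel for what a masking pass such as `--masking basic` does conceptually, here is a minimal regex-based sketch. The three rules below are illustrative assumptions chosen for this example; they are not Dir2md's actual patterns, and `mask_secrets` is a hypothetical name.

```python
import re

# Illustrative rules only (assumptions); NOT Dir2md's real masking patterns.
MASKING_RULES = [
    # Strings shaped like an AWS access key ID: "AKIA" + 16 uppercase alphanumerics.
    (re.compile(r"AKIA[0-9A-Z]{16}"), "[MASKED]"),
    # Assignments such as `password = value`; the key and separator are kept.
    (re.compile(r"\b(api_key|password|token)(\s*[:=]\s*)\S+", re.IGNORECASE),
     r"\1\2[MASKED]"),
    # PEM private key blocks, including multi-line bodies.
    (re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
                re.DOTALL), "[MASKED]"),
]


def mask_secrets(text: str) -> str:
    """Apply every rule in order and return the masked text."""
    for pattern, replacement in MASKING_RULES:
        text = pattern.sub(replacement, text)
    return text
```

Pattern-based masking only catches secrets that match a known shape, so it reduces the risk of accidental exposure rather than eliminating it.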
### Output Example
```markdown
# Project Blueprint
- Root: `/path/to/project`
- Generated: `2025-09-08 12:30:15`
- Preset: `raw`
- LLM mode: `inline`
- Estimated tokens (prompt): `6247`
## Directory Tree
[Complete file structure]
## Statistics
| Metric | Value |
|--------|-------|
| Total files | 42 |
| Estimated tokens | 6247 |
## File Contents
[Intelligently sampled content...]
```
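Because the header of the blueprint is a list of `- Key: value` bullets with the value in backticks, it is easy to read back in a script, for example to check the estimated token count in CI. The following is a sketch that assumes the header keeps exactly the shape shown in the example above; `parse_blueprint_header` is a hypothetical helper, not part of Dir2md.

```python
import re

# Matches bullets of the form:  - Key: `value`
HEADER_LINE = re.compile(r"^- (?P<key>[^:]+): `(?P<value>.*)`$")


def parse_blueprint_header(markdown: str) -> dict[str, str]:
    """Collect the header bullets that precede the first `##` heading."""
    fields: dict[str, str] = {}
    for line in markdown.splitlines():
        if line.startswith("## "):
            break  # the header ends where the first section begins
        match = HEADER_LINE.match(line.strip())
        if match:
            fields[match.group("key")] = match.group("value")
    return fields


sample = """# Project Blueprint
- Root: `/path/to/project`
- Generated: `2025-09-08 12:30:15`
- Preset: `raw`
- LLM mode: `inline`
- Estimated tokens (prompt): `6247`
## Directory Tree
"""
header = parse_blueprint_header(sample)
print(header["Preset"], header["Estimated tokens (prompt)"])
```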
## Available Presets
| Preset | Description | Best For |
|--------|-------------|-----------|
| `raw` | Full content inclusion | Development, code review |
| `iceberg` | Balanced sampling | General documentation |
| `pro` | Advanced optimization | Large projects, LLM context |
## Open-Core Model
### Free (OSS) Features
- Complete directory analysis
- Token optimization and sampling
- SimHash deduplication
- Basic security masking (3 patterns)
- All output modes and presets
- Deterministic builds
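For readers unfamiliar with SimHash, the sketch below shows the general technique: each token votes on every bit of a 64-bit fingerprint, and two texts count as near-duplicates when their fingerprints differ in only a few bits. Everything here is an assumption for illustration (word tokenization, BLAKE2b as the token hash, a distance threshold of 3); Dir2md's own implementation may differ in all of these.

```python
import hashlib
import re

FINGERPRINT_BITS = 64


def simhash(text: str) -> int:
    """Compute a 64-bit SimHash fingerprint from lowercase word tokens."""
    weights = [0] * FINGERPRINT_BITS
    for token in re.findall(r"\w+", text.lower()):
        digest = hashlib.blake2b(token.encode("utf-8"), digest_size=8).digest()
        token_hash = int.from_bytes(digest, "big")
        for bit in range(FINGERPRINT_BITS):
            # Each token votes +1 or -1 on every bit position.
            weights[bit] += 1 if (token_hash >> bit) & 1 else -1
    # A bit is set in the fingerprint when its votes sum to a positive value.
    return sum(1 << bit for bit, weight in enumerate(weights) if weight > 0)


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: str, b: str, max_distance: int = 3) -> bool:
    """Treat two texts as duplicates if their fingerprints are close (assumed threshold)."""
    return hamming_distance(simhash(a), simhash(b)) <= max_distance
```

Unlike an exact content hash, this fingerprint changes only slightly when a file changes slightly, so copies with minor edits can still be detected.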
### Pro Features
- Advanced security masking (9+ patterns)
- Parallel processing & caching
- Language-specific analysis plugins
- HTML/PDF export options
- Team integration (CI/CD, PR bots)
- Priority support
[Learn more about Pro features](FEATURES.md)
## Documentation
- **[Feature Comparison](FEATURES.md)** - Complete OSS vs Pro breakdown
- **[Current Status](CURRENT_FEATURES.md)** - What's implemented now
- **[Usage Examples](USAGE_EXAMPLES.md)** - Hands-on guide with examples
## 🛠️ CLI Reference
```bash
# Basic options
dir2md [path] -o output.md --preset [iceberg|pro|raw]
# Token control
--budget-tokens 6000 # Total token budget
--max-file-tokens 1200 # Per-file token limit
--sample-head 120 # Lines from file start
--sample-tail 40 # Lines from file end
# Filtering
--include-glob "*.py,*.md" # Include patterns
--exclude-glob "test*,*.tmp" # Exclude patterns
--only-ext "py,js,ts" # File extensions only
# Security
--masking [off|basic|advanced] # Secret masking level
# Output
--emit-manifest # Generate JSON metadata
--no-timestamp # Reproducible output
--dry-run # Preview without writing
```
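The interaction between `--budget-tokens` and `--max-file-tokens` can be sketched as a two-level cap: each file is limited individually, and the running total is limited overall. This is a simplified illustration under stated assumptions, not Dir2md's algorithm: the four-characters-per-token estimate is a rough heuristic, and `estimate_tokens` and `select_within_budget` are hypothetical names.

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic (assumption): about four characters per token."""
    return max(1, len(text) // 4) if text else 0


def select_within_budget(
    files: dict[str, str],
    budget_tokens: int = 6000,
    max_file_tokens: int = 1200,
) -> dict[str, str]:
    """Truncate files so that each fits its cap and the total fits the budget."""
    selected: dict[str, str] = {}
    remaining = budget_tokens
    for path, text in files.items():
        allowance = min(max_file_tokens, remaining)
        if allowance <= 0:
            break  # total budget exhausted; later files are skipped
        if estimate_tokens(text) > allowance:
            text = text[: allowance * 4]  # cut down to the allowed size
        selected[path] = text
        remaining -= estimate_tokens(text)
    return selected
```

A real tool would more likely truncate on line boundaries and use a proper tokenizer; the point here is only how the two limits combine.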
## 🤝 Contributing
We welcome contributions! Dir2md follows an open-core model:
- **Core functionality**: Open source (this repo)
- **Advanced features**: Commercial (separate repo)
- **Community**: All discussions welcome
### Development Setup
```bash
git clone https://github.com/your-org/dir2md.git
cd dir2md
python -m pytest -v # Run tests
python -m src.dir2md.cli . --dry-run # Test CLI
```
### Reporting Issues
- **Bug reports**: [GitHub Issues](https://github.com/your-org/dir2md/issues)
- 💡 **Feature requests**: [GitHub Discussions](https://github.com/your-org/dir2md/discussions)
- 📧 **Security issues**: [email protected]
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
Pro features are available under a separate commercial license.
## Why Dir2md?
Traditional documentation approaches fall short when working with AI assistants:
- **Too much noise**: Raw `tree` + `cat` includes irrelevant files
- **Token waste**: Unoptimized content hits LLM context limits
- **Security risks**: Accidental exposure of secrets and keys
- **No structure**: Difficult for AI to understand project layout
Dir2md addresses these problems by filtering out irrelevant files, sampling content within a token budget, masking secrets, and presenting the project as a structured blueprint.
---
*Made with ❤️ for developers who want their AI to understand their code*