Installation¶
This guide will help you install ComProScanner and its dependencies.
Requirements¶
- Python 3.12 or 3.13
- pip (Python package installer)
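Before installing, it can help to confirm that the active interpreter is one of the versions listed above. The following is a minimal standard-library sketch. The helper name `is_supported_python` is illustrative and is not part of ComProScanner.

```python
import sys

# (major, minor) pairs taken from the Requirements list above.
SUPPORTED_VERSIONS = {(3, 12), (3, 13)}


def is_supported_python(version_info=None):
    """Return True if the (major, minor) pair is one of the supported versions."""
    info = sys.version_info if version_info is None else version_info
    return (info[0], info[1]) in SUPPORTED_VERSIONS


if __name__ == "__main__":
    status = "supported" if is_supported_python() else "not supported"
    print(f"Python {sys.version_info[0]}.{sys.version_info[1]} is {status}")
```

Run it with the same interpreter you plan to use for the pip install, since a system Python and a virtual-environment Python can differ.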
Basic Installation¶
The simplest way to install ComProScanner is with pip:
pip install comproscanner
This installs the latest stable version from PyPI along with all required dependencies.
Installation from Source¶
If you want to install from source or contribute to development:
1. Clone the Repository¶
2. Install in Development Mode¶
pip install -e .
The -e flag installs the package in editable mode, so changes you make to the source code take effect without reinstalling.
Environment Variables¶
ComProScanner requires several API keys or provider credentials depending on your workflow. Create a .env file in your project directory:
# Publisher TDM API Keys (for direct article access)
SCOPUS_API_KEY=your_scopus_api_key # for Elsevier as well as metadata retrieval
WILEY_API_KEY=your_wiley_api_key
SPRINGER_OPENACCESS_API_KEY=your_springer_openaccess_api_key # Springer provides two separate keys for open access and TDM API
SPRINGER_TDM_API_KEY=your_springer_tdm_api_key
IOP_papers_path=local_path_to_iop_papers # IOP Publishing provides XML articles in bulk through SFTP access
# API Keys for LLM Models (at least one is required; it is used for data extraction)
OPENAI_API_KEY=your_openai_api_key
DEEPSEEK_API_KEY=your_deepseek_api_key
ANTHROPIC_API_KEY=your_anthropic_api_key
GEMINI_API_KEY=your_gemini_api_key
OPENROUTER_API_KEY=your_openrouter_api_key
# Hugging Face API Key (for accessing thellert/physbert_cased model for embeddings)
HF_TOKEN=your_huggingface_api_key
# Neo4j Configuration (for knowledge graph visualization)
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=your_password
NEO4J_DATABASE=neo4j
Keep your API keys secure and never commit them to version control!
For a provider-by-provider guide on how to obtain these credentials, see the API Key Guide.
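Because at least one LLM key is required, a quick sanity check of the .env file can catch template placeholders left in place. The sketch below uses only the standard library and a deliberately simplified parser: it does not handle quoted values, and it treats " #" as the start of an inline comment. It makes no claim about how ComProScanner itself loads the file. The helper names `parse_env_text` and `configured_llm_keys` are illustrative.

```python
from pathlib import Path

# Variable names copied from the .env template above.
LLM_KEY_NAMES = [
    "OPENAI_API_KEY",
    "DEEPSEEK_API_KEY",
    "ANTHROPIC_API_KEY",
    "GEMINI_API_KEY",
    "OPENROUTER_API_KEY",
]


def parse_env_text(text):
    """Parse KEY=value lines, skipping blank lines and full-line comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment such as "value # note".
        value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value
    return values


def configured_llm_keys(values):
    """Return LLM key names set to something other than a 'your_...' placeholder."""
    return [
        name
        for name in LLM_KEY_NAMES
        if values.get(name) and not values[name].startswith("your_")
    ]


if __name__ == "__main__":
    env_path = Path(".env")
    values = parse_env_text(env_path.read_text()) if env_path.exists() else {}
    found = configured_llm_keys(values)
    print("Configured LLM keys:", ", ".join(found) if found else "none")
```

The script prints key names only, never their values, so its output is safe to paste into a bug report.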
Optional Dependencies¶
For Additional LLM Providers¶
Install the integration package for each LLM provider you want to use:
# For Anthropic Claude
pip install langchain-anthropic
# For Google Gemini
pip install langchain-google-genai
# For Ollama (local models)
pip install langchain-ollama
# For TogetherAI Model Integration
pip install langchain-together
# For OpenRouter Model Integration
pip install langchain-openrouter
# For Cohere Model Integration
pip install langchain-cohere
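To see which of these optional integrations are already available in the current environment, a short standard-library check along these lines may help. It assumes each package's import name is its pip name with hyphens replaced by underscores, which is a common convention but is not confirmed here for every package. The helper names are illustrative.

```python
import importlib.util

# pip package names from the list above.
PROVIDER_PACKAGES = [
    "langchain-anthropic",
    "langchain-google-genai",
    "langchain-ollama",
    "langchain-together",
    "langchain-openrouter",
    "langchain-cohere",
]


def import_name(package_name):
    """Guess the import name by replacing hyphens with underscores (an assumption)."""
    return package_name.replace("-", "_")


def is_installed(module_name):
    """Return True if a top-level module with this name can be located."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    for package in PROVIDER_PACKAGES:
        state = "installed" if is_installed(import_name(package)) else "missing"
        print(f"{package}: {state}")
```

`find_spec` only locates the module without importing it, so the check stays fast and does not trigger any provider-side initialization.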
Verification¶
Verify your installation by running:
You should see the version number printed without any errors.
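One way to read the installed version without relying on any ComProScanner-specific attribute is the standard library's `importlib.metadata`. This is a minimal sketch that assumes the distribution name matches the pip name `comproscanner`. The helper `installed_version` is illustrative.

```python
from importlib import metadata


def installed_version(distribution_name="comproscanner"):
    """Return the installed version string, or None if the distribution is not found."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    print(version if version else "comproscanner is not installed in this environment")
```

If the script reports that the package is not installed even though pip succeeded, pip and Python are probably pointing at different environments.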
Upgrading¶
To upgrade to the latest version:
pip install --upgrade comproscanner
Troubleshooting¶
Common Issues¶
ImportError: No module named 'comproscanner'¶
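Make sure the package is installed in the Python environment you are actually running. Running pip show comproscanner prints the package metadata if it is installed; otherwise, reinstall with pip install comproscanner.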
API Key Errors¶
Ensure your .env file is in your project directory and that every key your workflow needs is set to a real value rather than a template placeholder.
Dependency Conflicts¶
If you encounter dependency conflicts, try creating a fresh virtual environment:
python -m venv compro_env
source compro_env/bin/activate # On Windows: compro_env\Scripts\activate
pip install comproscanner
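After activating the environment, you can confirm that Python is actually running from it. The check below is a small sketch based on the standard `sys.prefix` and `sys.base_prefix` comparison, which applies to environments created with `python -m venv`. The function name is illustrative.

```python
import sys


def in_virtual_environment():
    """Return True when the interpreter runs inside a venv-created environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtual_environment())
    print("interpreter:", sys.executable)
```

If this reports False after activation, the shell is likely still resolving `python` to a different interpreter.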
Next Steps¶
Now that you have ComProScanner installed, check out the Quick Start Guide to begin extracting data from scientific articles.