Metadata Collection
The metadata collection module finds and filters metadata for relevant scientific articles in the Scopus database. It queries the Scopus Search API using the property keywords you supply.
Basic Usage
from comproscanner import ComProScanner
# Initialize scanner
scanner = ComProScanner(main_property_keyword="piezoelectric")
# Collect metadata
scanner.collect_metadata()
Parameters
Required Parameters
main_property_keyword (str)
The main property of interest for your research. This keyword will be used to generate search queries for metadata collection.
Optional Parameters
base_queries (list)
List of base search queries related to the main property. If not provided, the main property keyword will be used as the sole base query.
extra_queries (list)
List of additional search queries to expand the search scope.
start_year (int)
Starting publication year for filtering articles. It must be greater than end_year, because the search is performed backwards in time.
end_year (int)
Ending publication year for filtering articles.
Default Values
base_queries = None
extra_queries = None
start_year = current year
end_year = current year - 2
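The defaults and the year constraint described above can be sketched in plain Python. This is only an illustration of the documented behaviour, not ComProScanner's own code: the helper name `resolve_metadata_args` and its return shape are hypothetical.

```python
from datetime import datetime


def resolve_metadata_args(main_property_keyword, base_queries=None,
                          extra_queries=None, start_year=None, end_year=None):
    """Hypothetical helper: apply the documented defaults and year check."""
    current_year = datetime.now().year

    # Documented fallback: the main keyword becomes the sole base query.
    if base_queries is None:
        base_queries = [main_property_keyword]

    # Documented defaults: current year and current year - 2.
    if start_year is None:
        start_year = current_year
    if end_year is None:
        end_year = current_year - 2

    # Documented constraint: the search runs backwards in time,
    # so start_year must be greater than end_year.
    if start_year <= end_year:
        raise ValueError(
            f"start_year ({start_year}) must be greater than end_year ({end_year})"
        )

    return {
        "base_queries": base_queries,
        "extra_queries": extra_queries,  # left as given; None by default
        "start_year": start_year,
        "end_year": end_year,
    }
```

Running a check like this before calling `collect_metadata()` catches a reversed year range early, before any API requests are made.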
Advanced Examples
Example 1: Broad Property Search
scanner = ComProScanner(main_property_keyword="magnetic")
scanner.collect_metadata(
    base_queries=[
        "magnetic",
        "magnetism",
        "ferromagnetic",
        "antiferromagnetic",
    ],
    extra_queries=[
        "materials",
        "thin films",
        "nanoparticles",
    ],
)
Example 2: Recent Publications Only
from datetime import datetime
current_year = datetime.now().year
scanner.collect_metadata(
    base_queries=["superconductivity"],
    start_year=current_year,
    end_year=current_year - 1,  # Last year only
)
Output Format
Metadata for all relevant articles is stored in a CSV file.
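Since the column layout of the CSV file is not listed here, a quick way to inspect it is to read the header and count the rows with Python's standard `csv` module. This is a minimal sketch: the function name `summarize_metadata_csv` and the path `metadata.csv` are assumptions, so substitute the file your run actually produced.

```python
import csv
from pathlib import Path


def summarize_metadata_csv(csv_path):
    """Return (column names, row count) for a CSV file with a header row."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        # Iterating consumes the data rows; the header is read on first access.
        row_count = sum(1 for _ in reader)
        columns = reader.fieldnames or []
    return list(columns), row_count


# Hypothetical path: replace with the CSV file written by your run.
csv_path = Path("metadata.csv")
if csv_path.exists():
    columns, row_count = summarize_metadata_csv(csv_path)
    print(f"{row_count} articles; columns: {columns}")
```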
Next Steps
- Learn about Article Processing
- Explore RAG configuration