Notice: This project is at an experimental stage. It works, but a lot could be done better. Please use it and open Issues for any problems you run into.
Install Docker and Docker Compose for your platform. Docker for Mac or Docker for Windows will install both tools for you if you are on either of these environments. It is, however, recommended to run directly on Linux, for native container support and fewer issues overall.
Then, clone this repo and set up the environment:
git clone https://github.com/metacpan/metacpan-docker.git
cd metacpan-docker
bin/metacpan-docker init
After this, you can run both the metacpan-web frontend on
http://localhost:5001 and the metacpan-api backend on
http://localhost:5000, with ElasticSearch on http://localhost:9200, via
bin/metacpan-docker localapi up
This will build the Docker images for the MetaCPAN and ElasticSearch services (which will take a while, especially on a fresh, first-time install of Docker) and run the services. You'll know they're ready when the services start listening on the ports listed above.
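If you would rather not watch the logs, one way to check readiness is to poll the three ports. The following is a minimal sketch and not part of metacpan-docker: the wait_for_http helper, its retry count, and the use of curl are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with metacpan-docker): poll a URL until
# something answers on it, or give up after a number of attempts.
wait_for_http() {
    url=$1
    tries=${2:-60}
    while [ "$tries" -gt 0 ]; do
        # Any HTTP response, even an error status, means the port is listening.
        if curl --silent --output /dev/null --max-time 2 "$url"; then
            return 0
        fi
        tries=$((tries - 1))
        # Pause between attempts, but not after the last one.
        if [ "$tries" -gt 0 ]; then sleep 2; fi
    done
    echo "gave up waiting for $url" >&2
    return 1
}

# Usage, with the ports listed above:
#   wait_for_http http://localhost:9200 &&
#   wait_for_http http://localhost:5000 &&
#   wait_for_http http://localhost:5001 &&
#   echo "all services are listening"
```

Run the usage lines in a separate terminal while the services are starting up.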
Don't forget to seed the local metacpan-api with a partial CPAN; run
the following command in a separate terminal to get you up to speed:
bin/metacpan-docker localapi exec api index-cpan.sh
This will prompt you to confirm removing old indices and setting up
mappings on the ElasticSearch service (say YES) then proceed to rsync
a partial CPAN in /CPAN for its metadata to be imported.
Once the above is done, you should be able to see your local partial CPAN data in e.g. http://localhost:5001/recent and elsewhere.
Alternatively, if you just want to hack on the web frontend, you can run this instead of all the above:
docker-compose up
From here, you can proceed and hack on the MetaCPAN code at
src/metacpan-api and/or src/metacpan-web directories, and saving
edits will reload the corresponding apps automatically!
When done hacking (or, more likely, when you need to rebuild/refresh your Docker environment) you can then run
bin/metacpan-docker localapi down
# or, if running the metacpan-web service only
docker-compose down
in another terminal to stop all MetaCPAN services and remove the containers.
For further details, read on!
The system consists of several services that live in docker containers:
- web: the web interface on http://localhost:5001
- api: the main server on http://localhost:5000
- elasticsearch: database on http://localhost:9200
- elasticsearch_test: test database on http://localhost:9300
These services use one or more Docker volumes:
- metacpan_cpan: holds the CPAN archive, mounted in /CPAN
- metacpan_elasticsearch: holds the ElasticSearch database files
- metacpan_elasticsearch_test: holds the ElasticSearch test database files
- metacpan_api_carton and metacpan_web_carton: hold the dependencies installed by Carton for the api and web services, respectively; mounted on /carton instead of local, to prevent clashing with the host user's Carton
Docker Compose is used to, uh, compose them all together into one
system. Using docker-compose directly is a mouthful, however, so
putting this all together is done via the bin/metacpan-docker script
to simplify setup and usage (and to get you started hacking on the
MetaCPAN sooner!)
bin/metacpan-docker is a thin wrapper for the docker-compose
command, providing the environment variables necessary to run a basic
MetaCPAN environment. It provides these subcommands:
The init subcommand basically clones the metacpan-api
and metacpan-web repositories, and sets up the git commit hooks for
each of them, in preparation for future docker-compose or
bin/metacpan-docker localapi commands.
The localapi subcommand adds the necessary configuration for
docker-compose to run both the metacpan-web and metacpan-api
services, along with elasticsearch and Docker volumes. Under the
hood, it customizes the COMPOSE_FILE and COMPOSE_PROJECT_NAME
environment variables used by docker-compose to use additional YAML
configuration files aside from the default docker-compose.yml.
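To make the mechanism concrete, here is a rough sketch of the kind of environment the localapi subcommand could set up. The exact file list and the project name "metacpan" are assumptions (the project name is inferred from the metacpan_ prefix on the volume names); check bin/metacpan-docker itself for the real values. The localapi_env function is hypothetical.

```shell
#!/bin/sh
# Sketch of the environment the localapi subcommand might export.
# ASSUMPTIONS: the file list and the project name "metacpan" are guesses
# for illustration; bin/metacpan-docker holds the real values.
localapi_env() {
    # docker-compose reads COMPOSE_FILE as a list of files separated by ":"
    # (";" on Windows); later files extend and override earlier ones.
    COMPOSE_FILE="docker-compose.yml:docker-compose.localapi.yml"
    COMPOSE_PROJECT_NAME="metacpan"
    export COMPOSE_FILE COMPOSE_PROJECT_NAME
}

# Usage: roughly what "bin/metacpan-docker localapi up" might expand to.
#   localapi_env && docker-compose up
```

With these variables exported, a plain docker-compose invocation picks up the extra YAML file without needing repeated -f options.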
As noted earlier, bin/metacpan-docker is a thin wrapper to
docker-compose, so commands like up, down, and run will work as
expected from docker-compose. See the docker-compose docs for an
overview of available commands.
The web service is a checkout of metacpan-web, built as a Docker
image. Running this service alone is enough if you want to just hack on
the frontend, since by default the service is configured to talk to
https://fastapi.metacpan.org for its backend; if this is what you want,
then you can simply invoke docker-compose up.
The api service is a checkout of metacpan-api, built as a Docker
image, just like the web service.
If using this service to run a local backend, you will need to run some
additional commands in a separate terminal once bin/metacpan-docker localapi up runs:
Running
bin/metacpan-docker localapi exec api partial-cpan-mirror.sh
will rsync modules from selected CPAN authors, plus the package and author
indices, into the api service's /CPAN directory. This is nearly
equivalent to the same script in the metacpan-developer repository.
Running
bin/metacpan-docker localapi exec api bin/run bin/metacpan mapping --delete
bin/metacpan-docker localapi exec api bin/run bin/metacpan release /CPAN/authors/id
bin/metacpan-docker localapi exec api bin/run bin/metacpan latest
bin/metacpan-docker localapi exec api bin/run bin/metacpan author
in sequence will create the indices and mappings in the elasticsearch
service, and import the /CPAN data into elasticsearch.
If you're too impatient or lazy to do all of the above, just running
bin/metacpan-docker localapi exec api index-cpan.sh
instead will set it all up for you.
The elasticsearch and elasticsearch_test services use the
official ElasticSearch Docker image, configured with settings and
scripts taken from the metacpan-puppet repository. The api service
depends on them.
Suppose you have a local minicpan in /home/ftp/pub/CPAN. If you would
like to use this in metacpan-docker, then edit the
docker-compose.localapi.yml to change the api service's volume
mounts to use your local minicpan as /CPAN, e.g.:
services:
  api:
    volumes:
      - /home/ftp/pub/CPAN:/CPAN
      ...
Note that if you want CPAN author data indexed into ElasticSearch, your
minicpan should include authors/00whois.xml. Full indexing would take
the better part of a day or two, depending on your hardware.
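Before mounting a minicpan, it may be worth confirming that it has the pieces mentioned above. The sketch below is a hypothetical pre-flight check, not something metacpan-docker provides; the check_minicpan name is invented, and it only looks for authors/id (the path passed to the release import) and authors/00whois.xml.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of metacpan-docker): confirm that a
# local minicpan contains the paths the indexing steps refer to.
check_minicpan() {
    root=$1
    if [ ! -d "$root/authors/id" ]; then
        echo "missing directory: $root/authors/id" >&2
        return 1
    fi
    if [ ! -f "$root/authors/00whois.xml" ]; then
        echo "missing $root/authors/00whois.xml: author data will not be indexed" >&2
        return 1
    fi
    return 0
}

# Usage:
#   check_minicpan /home/ftp/pub/CPAN && echo "minicpan looks usable"
```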
Use bin/metacpan-docker exec, run, and similar:
# Run tests for metacpan-web against fastapi.metacpan.org
bin/metacpan-docker exec web bin/prove
# Run tests for metacpan-web against local api
bin/metacpan-docker localapi exec web bin/prove
# Run tests for metacpan-api against local elasticsearch_test
bin/metacpan-docker localapi exec api bin/prove
Because both the api and web services are running inside
clean Perl containers, it is possible to maintain a clean set of
Carton dependencies independent of your host machine's perl. Just
update the cpanfile of the project, and run
bin/metacpan-docker exec web carton install
# or
bin/metacpan-docker exec api carton install
Due to the way the Compose services are configured, these commands will
update the corresponding cpanfile.snapshot safely, whether or not you
have a local directory on the host (internally, the containers' local
directory is placed in /carton instead, to avoid interfering with
the host user's own local Carton directory).
By default, docker-compose.localapi.yml configures the
elasticsearch service to listen on the Docker host at
http://localhost:9200; it is also accessible via the Docker default
network address of http://172.17.0.1:9200. You can inspect it via simple
curl or wget requests, or use a Kibana container, e.g.
docker run --rm -p 5601:5601 -e ELASTICSEARCH_URL=http://172.17.0.1:9200 -it kibana:4.6
Running the above will provide a Kibana container at
http://localhost:5601, which you can configure to have it read the
cpan* index in the elasticsearch service.
It is also certainly possible to run Kibana as part of the compose
setup, by configuring e.g. a kibana service.
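For quick checks without Kibana, the curl inspection mentioned above could look something like the sketch below. The es_get helper and the ES_URL variable are hypothetical names introduced here; the two paths in the usage comments are standard Elasticsearch endpoints, but what they return depends on the state of your local indices.

```shell
#!/bin/sh
# Sketch of inspecting the elasticsearch service with curl.
# ES_URL defaults to the address given above; es_get is a hypothetical helper.
ES_URL=${ES_URL:-http://localhost:9200}

es_get() {
    # Print the response for a path under the Elasticsearch root URL.
    curl --silent --show-error "$ES_URL/$1"
}

# Usage, once "bin/metacpan-docker localapi up" is running:
#   es_get '_cluster/health?pretty'   # overall cluster status
#   es_get '_cat/indices?v'           # list indices, including cpan*
```

Set ES_URL=http://172.17.0.1:9200 first if you are calling it from another container on the default Docker network.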
- Integrate other MetaCPAN services (e.g. github-meets-cpan)
- Add more Tips and tricks (as we continue hacking MetaCPAN in Docker)
- Provide a "near-production" Docker Compose configuration, suitable for Docker Swarm, and/or
- Refactor configuration to be suitable for Kubernetes (Google Cloud) deployments
- Docker Compose documentation
- metacpan-developer and metacpan-puppet, on which much of the information about the architecture is based