Wednesday, December 02, 2015

Python - pkg_resources.DistributionNotFound

Recently I installed a Python tool called cassandra_snapshotter (https://github.com/tbarbugli/cassandra_snapshotter) to back up one of our Cassandra clusters. After installation, when I tried to run it, I got the following error:

# cassandra-snapshotter-agent
Traceback (most recent call last):
  File "/usr/bin/cassandra-snapshotter-agent", line 5, in <module>
    from pkg_resources import load_entry_point
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 2655, in <module>
    working_set.require(__requires__)
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 648, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 546, in resolve
    raise DistributionNotFound(req)
pkg_resources.DistributionNotFound: boto>=2.29.1

It complained that the boto package was not installed. But I could see the "boto" package in "/usr/lib/python2.6/site-packages", "pip list | grep boto" showed "boto (2.38.0)", and I could import boto from the Python command line.
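The traceback shows that the failure comes from pkg_resources resolving the requirement declared by the script (working_set.require), which is a separate check from whether "import boto" succeeds. Here is a minimal sketch that runs both checks side by side. It assumes a current Python 3 with setuptools available (not the Python 2.6 from the traceback), and the function name import_vs_requirement is made up for illustration.

```python
import importlib


def import_vs_requirement(module_name, requirement):
    """Return (importable, resolvable) for a module and a requirement string.

    importable: True if `import module_name` works.
    resolvable: True if pkg_resources can satisfy the requirement,
                False if it raises a resolution error,
                None if pkg_resources itself is not available.
    """
    try:
        importlib.import_module(module_name)
        importable = True
    except ImportError:
        importable = False

    try:
        import pkg_resources  # shipped with setuptools; may be absent
    except ImportError:
        return importable, None

    try:
        # Same kind of check the generated entry-point script performs.
        pkg_resources.require(requirement)
        resolvable = True
    except pkg_resources.ResolutionError:
        # Covers DistributionNotFound and VersionConflict.
        resolvable = False
    return importable, resolvable
```

Called as import_vs_requirement("boto", "boto>=2.29.1"), a result of (True, False) would match the situation described here: the module imports fine, but the requirement cannot be resolved.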

So "boto" was clearly installed; the problem was that cassandra-snapshotter couldn't find it. After some research, I found out that the boto package on my server had been installed by "easy_install", while cassandra-snapshotter had been installed by "pip". That is where the confusion came from. I fixed it by running:

# sudo easy_install -m boto
# rm -rf /usr/lib/python2.6/site-packages/boto-2.38.0-py2.6.egg (easy_install has no uninstall option)
# sudo pip install cassandra-snapshotter (which installs boto for you)

The problem went away. Mixing "pip" and "easy_install" led to the "pkg_resources.DistributionNotFound" problem. Always use just one of these tools to manage your Python packages!
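To see which tool installed a given package, one option on a current Python 3 (3.8 or later, so again not the 2.6 setup above) is to read the INSTALLER file from the package metadata. pip writes this file for the packages it installs, so a missing file hints that some other tool did the install. This is only a rough sketch, and the function name describe_install is made up for illustration.

```python
from importlib import metadata


def describe_install(name):
    """Return (version, installer) for an installed distribution.

    Returns None if no distribution with that name is found.
    installer is "unknown" when the metadata has no INSTALLER file,
    which is only a hint, not proof, that pip did not install it.
    """
    try:
        dist = metadata.distribution(name)
    except metadata.PackageNotFoundError:
        return None
    # read_text returns None when the file is missing from the metadata.
    installer = dist.read_text("INSTALLER")
    return dist.version, (installer or "unknown").strip()
```

For example, describe_install("boto") returns None if boto is not installed, and otherwise a pair of the version and the installer name.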

A little background on pip and easy_install:

To understand the differences between "pip" and "easy_install", you need to understand "setuptools" first. "setuptools" is a collection of enhancements to Python's distutils that allow you to more easily build and distribute Python packages, especially ones that depend on other packages.

"easy_install" is part of "setuptools". It was released in 2004, and at the time of its release it was notable for installing packages from PyPI using requirement specifiers and for automatically installing dependencies.

"pip" came in 2008 as an alternative to "easy_install", although it is still largely built on top of "setuptools". It was notable at the time for not installing packages as Eggs or from Eggs, and for introducing the idea of "requirements" files, which gave users the power to easily replicate environments.
