
Getting lists of dependent packages #3

Open
@grlee77

Description

It is useful to know who is using scikit-image when planning funding proposals. I looked into how one can extract the kind of information presented in GitHub's dependents view.

So far, it seems that it is possible to query the dependencies of scikit-image via an experimental API, but there is no public API for querying the dependent packages. You can browse them manually, but that is tedious given that there are > 1,000 packages in our case!

However, I found that with some modifications to the web scraping script from this Stack Overflow post, we can extract this information into a list of packages along with the number of stars and forks for each dependent.

We can then combine that with PyGithub to retrieve the "topics" associated with each of these packages, so that we can sort by number of stars and keep only those packages containing certain terms in the repository name or topic list (e.g. "brain", "cell", "mri", "microscopy").
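As a rough illustration of the topic-retrieval step, here is a minimal sketch. It assumes the scraped packages are available as dicts with `name` (`owner/repo`), `stars` and `forks` keys; that layout and the helper name `attach_topics` are made up for this example and are not taken from the actual script. The only PyGithub calls it relies on are `get_repo()` and `Repository.get_topics()`.

```python
def attach_topics(client, packages):
    """Add a 'topics' list to each package and sort by stars, highest first.

    `client` is any object exposing PyGithub-style `get_repo(full_name)`;
    `packages` is a list of dicts with 'name', 'stars' and 'forks' keys
    (a hypothetical layout chosen for this sketch).
    """
    enriched = []
    for pkg in packages:
        # One API request per repository; topics come back as a list of strings.
        topics = client.get_repo(pkg['name']).get_topics()
        enriched.append({**pkg, 'topics': topics})
    return sorted(enriched, key=lambda p: p['stars'], reverse=True)


# Possible usage with PyGithub (needs network access and, realistically, a token):
#   from github import Github
#   client = Github("<personal access token>")
#   ranked = attach_topics(client, scraped_packages)
```

With several hundred dependents this makes one request per repository, so an authenticated client is advisable; unauthenticated GitHub API access is limited to a small number of requests per hour.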

Running this script on scikit-image gave a list of 857 packages that depend on scikit-image and are active (i.e. are not represented by a "ghost" icon in the web interface). Of these:

  • 225 packages have >= 25 stars
  • 212 packages have between 5 and 24 stars
  • 420 have < 5 stars
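For reference, the three star tiers can be tallied with something like the sketch below. The function names are hypothetical; only the thresholds (25 and 5) come from the breakdown above.

```python
from collections import Counter


def star_bucket(stars):
    """Map a star count to one of the three tiers used in this issue."""
    if stars >= 25:
        return '>= 25'
    if stars >= 5:
        return '5-24'
    return '< 5'


def summarize(star_counts):
    """Count how many packages fall into each tier."""
    return Counter(star_bucket(s) for s in star_counts)


# Small made-up example, not the real data:
print(summarize([0, 4, 5, 24, 25, 100]))
```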

The numbers above are for ALL application areas. I excluded packages with < 5 stars and then filtered to retain only those that have names/topics related to bioimaging, microscopy, medical imaging, etc. This results in a final list of:

  • 35 packages with >= 25 stars
  • 38 packages with 5-24 stars

Topic terms used to determine biological application status:

    bioimage_search_terms = [
        'airways', 'anatomy', 'arteries', 'astrocytes',
        'atomic-force-microscopy', 'afm', 'axon', 'bioimage-informatics',
        'bioinformatics', 'biologists', 'biomedical-image-processing',
        'bionic-vision', 'biophysics', 'brain-connectivity', 'brain-imaging',
        'brain-mri', 'brain-tumor-segmentation', 'brats', 'calcium',
        'cancer-research', 'cell-biology', 'cell-detection',
        'cell-segmentation', 'computational-pathology', 'connectome',
        'connectomics', 'cryo-em', 'ct-data', 'deconvolution-microscopy',
        'dicom', 'dicom-rt', 'digital-pathology-data', 'digital-pathology',
        'digital-slide-archive', 'dmri', 'electron-microscopy',
        'electrophysiology', 'fluorescence',
        'fluorescence-microscopy-imaging', 'fmri', 'fmri-preprocessing',
        'functional-connectomes', 'healthcare-imaging', 'histology', 'voxel',
        'microorganism-colonies', 'microscopy', 'microscopy-images',
        'neuroimaging', 'medical', 'medical-image-computing',
        'medical-image-processing', 'medical-images', 'medical-imaging',
        'mri', 'myelin', 'neural-engineering', 'neuroanatomy', 'neuroimaging',
        'neuroimaging-analysis', 'neuropoly', 'neuroscience',
        'nih-brain-initiative', 'openslide', 'pathology', 'pathology-image',
        'radiation-oncology', 'radiation-physics', 'raman',
        'retinal-implants', 'scanning-probe-microscopy',
        'scanning-tunnelling-microscopy', 'single-cell-imaging',
        'slide-images', 'spectroscopy', 'spinalcord', 'stm', 'stem',
        'stitching', 'structural-connectomes', 'tissue-localization',
        'tomography', 'volumetric-images', 'whole-slide-image',
        'whole-slide-imaging',
    ]

Search terms in project name string:

    reponame_terms = [
        'brain', 'cell', 'ecg', 'eeg', 'medi', 'mri', 'neuro', 'pathol',
        'retin', 'slide', 'spectro', 'tissue', 'tomo',
    ]
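
To make the matching rule concrete, here is a minimal sketch of how the two term lists might be applied. It assumes name terms are matched as case-insensitive substrings of the repository name and topic terms as exact matches against the repository's topics; the issue does not spell out the exact rule, so treat both as assumptions. The function name `is_bio_related` is hypothetical, and the lists are shortened so the block runs on its own.

```python
# Abbreviated subsets of the two lists above, repeated here so that the
# sketch is self-contained.
bioimage_search_terms = ['cell-segmentation', 'dicom', 'microscopy', 'mri']
reponame_terms = ['brain', 'cell', 'medi', 'mri', 'neuro']


def is_bio_related(repo_name, topics,
                   topic_terms=bioimage_search_terms,
                   name_terms=reponame_terms):
    """Return True if the repo name or any of its topics matches a term."""
    name = repo_name.lower()
    # Assumption: name terms are substrings ('medi' matches 'medical').
    if any(term in name for term in name_terms):
        return True
    # Assumption: topic terms must equal a topic exactly.
    return not {t.lower() for t in topics}.isdisjoint(topic_terms)


print(is_bio_related('CellProfiler', []))
print(is_bio_related('some-toolkit', ['microscopy', 'python']))
print(is_bio_related('some-toolkit', ['deep-learning']))
```

Substring matching on short stems such as 'medi' also catches unrelated names (for example anything containing "media"), so the filtered list is worth a manual pass.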

A detailed list of dependent biology-related packages with 5 or more stars is given in the table in the next comment.

Two caveats:
1.) The above list is probably a lower bound. There may be other packages that did not list any "topic" terms and did not use an obvious biology-related term in the project name.
2.) The above list covers only downstream packages. There are probably an order of magnitude more one-off repositories from individual users who make use of scikit-image but do not package or distribute their code.
