Add a datasets.toy_dataframe()? #1189

Open
@jeromedockes

Description

(or some better name)

a function that creates and returns a small dataframe with columns of various types, which we can use in the docstring examples. the goal is to avoid having to create the data inside each example, and also to avoid downloading an actual dataset.
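to make the idea concrete, here is a minimal sketch of what such a function might look like. everything in it is an assumption for illustration: the name `toy_dataframe`, the column names and the values are hypothetical, and it uses pandas only because that keeps the example short.

```python
import pandas as pd


def toy_dataframe():
    """Return a small dataframe mixing several column types.

    Hypothetical sketch: the name, columns and values are made up
    for illustration and are not part of any existing API.
    """
    return pd.DataFrame(
        {
            "id": [1, 2, 3, 4],  # integers
            "name": ["Alice", "Bob", "Carol", "Dan"],  # strings
            "city": pd.Categorical(["Paris", "London", "Paris", "Rome"]),
            "height": [1.62, 1.80, None, 1.75],  # floats with one missing value
            "joined": pd.to_datetime(
                ["2020-01-15", "2021-06-30", "2022-03-01", "2023-11-20"]
            ),
            "is_member": [True, False, True, True],  # booleans
        }
    )


df = toy_dataframe()
print(df.dtypes)
```

a docstring example could then start from a single call to the function instead of building its own data.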

a first step is to make a list of docstrings that could benefit from such a function and see if we can come up with one that would suit a good proportion of them

an alternative could be using one of the scikit-learn datasets that don't need network access, but IIRC their columns don't cover many different types. maybe the titanic one?
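a quick way to check the column types of a bundled scikit-learn dataset is sketched below. it uses `load_iris` only as one example of a dataset that ships with the package; the other bundled loaders would need the same check.

```python
from sklearn.datasets import load_iris

# load_iris ships with scikit-learn, so no network access is needed
iris = load_iris(as_frame=True).frame

# collect the numpy "kind" of each column ("f" = float, "i" = integer, ...)
kinds = {dtype.kind for dtype in iris.dtypes}
print(iris.dtypes)
print(sorted(kinds))
```

if only numeric kinds show up, the dataset has no string, categorical or datetime columns to illustrate with.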

note we have a similar function as a pytest fixture, but its goal is different: it covers many weird combinations of int, float, missing values, etc., whereas the new function would be for illustration, not testing
