Special character \H removed from filepath #492

Open
@stephanedebove

Description

Describe the bug

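The convert_to_unicode() customization treats a \H sequence inside a file path as the LaTeX double-acute accent command, so the backslash, the H, and the letter that follows are replaced by a single accented character.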

Code

This code:

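import bibtexparser
from bibtexparser.bparser import BibTexParser
from bibtexparser.customization import convert_to_unicode

with open(BIB_PATH, 'r', encoding='utf-8') as bib_file:
    parser = BibTexParser()
    parser.customization = convert_to_unicode  # applied to every field, including "file"
    bib_database = bibtexparser.load(bib_file, parser=parser)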

running on a bib file containing this entry:

@article{Hagger2022,
  title = {Perceived Behavioral Control Moderating Effects in the Theory of Planned Behavior: {{A}} Meta-Analysis},
  file = {C:\Users\name\Documents\Zotero\storage\7J78GAC5\Hagger et al_2022_Perceived behavioral control moderating effects in the theory of planned.pdf}
}

will consume the "\H" in the file path and put a double acute accent on the following "a", so the file path becomes:

C:\Users\name\Documents\Zotero\storage\7J78GAC5a̋gger et al_2022_Perceived behavioral control moderating effects in the theory of planned.pdf
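
For context, \H is the LaTeX command for a double acute accent, which is why "\Ha" collapses into "a̋". The sketch below is a toy illustration of that effect using a regular expression. It is not bibtexparser's implementation; the function name naive_expand_H and the shortened path are made up for the example.

```python
import re

DOUBLE_ACUTE = "\u030b"  # Unicode COMBINING DOUBLE ACUTE ACCENT

def naive_expand_H(text):
    """Toy illustration only: treat every \\H<letter> as a LaTeX accent command."""
    return re.sub(
        r"\\H\{?([A-Za-z])\}?",
        lambda m: m.group(1) + DOUBLE_ACUTE,  # letter followed by the combining accent
        text,
    )

# Shortened, hypothetical path with the same "\H" pattern as in the report.
path = r"C:\storage\7J78GAC5\Hagger et al.pdf"
print(naive_expand_H(path))
```

A converter that cannot tell a Windows path separator from a LaTeX command will rewrite any path component that starts with H in this way.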

Reproducing

Version: 1.4.2

Workaround

For now, I rewrote the convert_to_unicode function to skip the file field:

def convert_to_unicode(record):
    for val in record:
        if val == "file":
            continue
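
An alternative that avoids copying the library function might be to wrap it: save the protected fields, run the converter, then restore them. The following is a minimal sketch under that assumption. The names make_skipping_customization and stand_in_convert are hypothetical, and the stand-in converter only imitates the "\Ha" behavior so the example runs without bibtexparser installed.

```python
def make_skipping_customization(convert, skip_fields=("file",)):
    """Wrap a record customization so the listed fields pass through untouched."""
    def customization(record):
        # Remember the raw values of the protected fields.
        saved = {k: record[k] for k in skip_fields if k in record}
        record = convert(record)
        # Put the raw values back after conversion.
        record.update(saved)
        return record
    return customization

def stand_in_convert(record):
    # Hypothetical stand-in for convert_to_unicode, used only for this demo.
    for key, value in record.items():
        record[key] = value.replace("\\Ha", "a\u030b")
    return record

custom = make_skipping_customization(stand_in_convert)
entry = {"title": r"\Hagger", "file": r"C:\storage\7J78GAC5\Hagger.pdf"}
result = custom(entry)
print(result["file"])  # the path is left exactly as it was in the .bib file
```

With bibtexparser 1.x, the same wrapper could presumably be assigned as parser.customization = make_skipping_customization(convert_to_unicode), since a customization there is a function that takes a record dict and returns it.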
