Yesterday I was updating one of the specifications I maintain for my day job.

I have previously blogged about configuring Markdownlint for use with a GitHub Action, migrating from Travis CI to GitHub Actions for that part, and I had for some time wanted to venture into more extensive use of GitHub Actions for this work.

The issue addressed was some misinformation in the description of a process that had been obsoleted. It was not something I could have caught using software, but it got me thinking whether there were other parts of this maintenance where GitHub Actions could be of assistance, and it struck me that checking for spelling errors could be one such area.

I did a quick search on the GitHub Marketplace and three Actions were already available.

As I commented in the blog post mentioned above, often somebody has already implemented the action you need, and using something that is already out there instead of rolling your own can save you a lot of time.

I use VS Code as my primary editor and I have both spell checking and Markdown linting integrated in the editor using extensions, but sometimes I miss the reported problems and commit anyway. This argues for implementing these checks as Git pre-commit hooks, but for now CI using GitHub Actions is my safety net. And for the occasional PR I cannot be sure the contributor's toolchain matches my own, even though relevant configuration files are included in the repositories, so GitHub Actions are useful.

Now let's get down to business.

I decided on "Spellcheck Action" since it had 17 stars.

I started by reading over the documentation. The documentation on the Marketplace was quite sparse. It did mention the use of PySpelling and the possibility of specifying a spellcheck.yaml to override the default configuration, which all sounded very good and useful. It was only release 0.2.0, but I am not so hung up on version numbers and it sounded like it would fit my use-case.

Next up was reading the code. The implementation was based on a Docker image, which also suited me fine, since the little experience I have with GitHub Actions is with a Docker-based solution and not JavaScript, which is the other option.

Oh yeah and I got a PR created, since I fell over something which I believe to be a spelling error in a configuration example.

I added the action to a new unpublicized repository I am setting up for a new initiative and started to configure it.

Many, many attempts later, I decided to take a break and do something else.

The first problem was simply the action complaining about missing the required dictionary file: wordlist.txt

cp: cannot stat '/wordlist.txt': No such file or directory
['aspell', '--lang', 'en', '--encoding', 'utf-8', 'create', 'master', '/github/workspace/wordlist.dic']
Current wordlist: 'wordlist.txt'
Problem compiling dictionary. Check the binary path and options.
Traceback (most recent call last):
  File "/usr/local/bin/pyspelling", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.7/site-packages/pyspelling/__main__.py", line 34, in main
    debug=args.debug
  File "/usr/local/lib/python3.7/site-packages/pyspelling/__main__.py", line 59, in run
    debug=debug
  File "/usr/local/lib/python3.7/site-packages/pyspelling/__init__.py", line 672, in spellcheck
    for result in spellchecker.run_task(task, source_patterns=sources):
  File "/usr/local/lib/python3.7/site-packages/pyspelling/__init__.py", line 307, in run_task
    personal_dict = self.setup_dictionary(task)
  File "/usr/local/lib/python3.7/site-packages/pyspelling/__init__.py", line 350, in setup_dictionary
    output
  File "/usr/local/lib/python3.7/site-packages/pyspelling/__init__.py", line 380, in compile_dictionary
    with open(wordlist, 'rb') as src:
FileNotFoundError: [Errno 2] No such file or directory: 'wordlist.txt'

I tried a variety of solutions, including adding an empty wordlist.txt:

$ touch wordlist.txt

But to no avail.

I returned later in the afternoon after some pondering, still without luck, and I decided to call it quits for the day.

Later it came to me that I had ignored all best practices in problem solving. I was just firing commits at the problem, thinking:

this will do it!

I decided to take a step back and examine the Dockerfile as a stand-alone container on my own machine, instead of evaluating possible solutions via GitHub and wasting the resources allocated to running the actions. This particular action is not particularly fast, so I needed to speed up the feedback loop. I also still felt that it was me who was missing something and that I was not using the action correctly.

On a side note, this is one of the reasons I love open source: if you have a problem, you can peek at the innards and often even poke at them to get them to behave.

Getting the Docker image to build was quite easy:

$ docker build -t github-action-spellcheck .

Running it, not so much, since the context of the action was not really set up. In the examples I have seen, all actions work on the checked-out project, sort of magically.

$ docker run -it github-action-spellcheck

¯\_(ツ)_/¯

A lot of information is available on the context of the action in GitHub. However, I decided not to focus on the GitHub integration but on the basic container, and I went for a small detour to understand the action as a whole.

  1. It was a Docker-based solution, with a Dockerfile and an ENTRYPOINT file: entrypoint.sh
  2. PySpelling is an encapsulation of aspell, a widely adopted component for spell checking

I installed aspell via Homebrew and tried it out. It worked like a charm and is good to have as a backup for doing more interactive editorial work and in the long run for building up dictionaries.
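
For reference, here is a minimal sketch of using aspell non-interactively: `aspell list` reads text on standard input and prints the words it does not recognise. The helper name `misspelled_words` and the `ASPELL` override variable are made up for this example and are not part of aspell or the action.

```shell
#!/bin/sh
# Hypothetical helper: print the unique words in a file that aspell does not know.
# ASPELL may be set to another command name, which is handy for testing
# on a machine without aspell installed.
misspelled_words() {
    file=$1
    # "list" makes aspell read stdin and write unknown words, one per line
    "${ASPELL:-aspell}" --lang=en list < "$file" | sort -u
}

# Example (requires aspell and an English dictionary to be installed):
#   misspelled_words README.md
```

An empty result means aspell recognised every word in the file.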

I installed PySpelling; the recipe for this was taken from the Dockerfile:

RUN pip3 install pyspelling

After repeating that step locally I could emulate the work done by the ENTRYPOINT outlined in the entrypoint.sh file.

$ pyspelling -c spellcheck.yaml
Traceback (most recent call last):
  File "/usr/local/bin/pyspelling", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.7/site-packages/pyspelling/__main__.py", line 34, in main
    debug=args.debug
  File "/usr/local/lib/python3.7/site-packages/pyspelling/__main__.py", line 59, in run
    debug=debug
  File "/usr/local/lib/python3.7/site-packages/pyspelling/__init__.py", line 672, in spellcheck
    for result in spellchecker.run_task(task, source_patterns=sources):
  File "/usr/local/lib/python3.7/site-packages/pyspelling/__init__.py", line 316, in run_task
    for sources in self._walk_src(source_patterns, glob_flags, glob_limit, self.pipeline_steps, expect_match):
  File "/usr/local/lib/python3.7/site-packages/pyspelling/__init__.py", line 216, in _walk_src
    '\n'.join('- {}'.format(target) for target in targets)
RuntimeError: None of the source targets from the configuration match any files:
- **/*.py

Do note that the above output was added afterwards, when retracing my steps, and is based on the default spellcheck.yaml.

In my commit frenzy (ref: April 4.) I had altered the file to focus on Markdown targets:

matrix:
- name: Markdown
  aspell:
    lang: en
  dictionary:
    wordlists:
    - wordlist.txt
    output: wordlist.dic
    encoding: utf-8
  pipeline:
  - pyspelling.filters.markdown:
  - pyspelling.filters.html:
    comments: false
    ignores:
    - code
    - pre
  sources:
  - '**/*.md'
  default_encoding: utf-8

AND FINALLY we failed with a success:

$ pyspelling -c spellcheck.yaml
Misspelled words:
<htmlcontent> README.md: html>body>ul>li
--------------------------------------------------------------------------------
Pragma
Readonly
--------------------------------------------------------------------------------

!!!Spelling check failed!!!

NB: The above example is from another repository, but you get the picture. Again, recapping the exact steps is not always feasible.

I had found out that my configuration files were working and the tools were working (locally though), which left me with the Docker integration. It seemed the Docker integration did not put the files in the right place for the Python component to use the configuration files. As I wrote earlier, a lot of context was available in the action logs on GitHub, but my focus was still on getting it to work locally before attacking the action, which I had already attempted without success.

In desperation I went over the action's repository and fell over a reported issue.

  • "Marketplace still using 0.2.0 which has wordlist.txt issue."

Heh, should perhaps have checked this the moment I observed my issues - well.

I had gotten around to familiarizing myself with the action's implementation, and although still a n00b in actions, I felt like I was close to a solution.

As mentioned, running the locally built Docker image was pretty useless.

$ docker run -it github-action-spellcheck

¯\_(ツ)_/¯

What if I provided the repository to the Docker image like so:

$ docker run -it  -v $PWD:/ github-action-spellcheck
docker: Error response from daemon: invalid volume specification: '/Users/jonasbn/develop/github/blog-examples:/': invalid mount config for type "bind": invalid specification: destination can't be '/'.
See 'docker run --help'.

The next attempt used a non-root directory in the container, which required a minor change to the Dockerfile: the following line was added just above the ENTRYPOINT entry.

WORKDIR /tmp

New attempt:

$ docker run -it  -v $PWD:/tmp github-action-spellcheck

And it worked. Now I had working tools, working configuration files and a working Docker image. Next I needed to get it to work as a proper action, which was the original goal.
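
A possible alternative to editing the Dockerfile, sketched here under assumptions: the path /github/workspace appears in the action log earlier in this post, and GitHub, as far as I understand, mounts the checked-out repository there and uses it as the working directory. The same can be emulated with the `-v` and `-w` options of `docker run`. The function name `run_spellcheck_container` and the `DOCKER` override variable are made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: run the locally built image against a repository,
# mounted where the action log suggests GitHub mounts it.
# DOCKER may be set to another command (e.g. echo) for a dry run.
run_spellcheck_container() {
    repo_dir=${1:-$PWD}
    image=${2:-github-action-spellcheck}
    # -v mounts the repository, -w sets the working directory in the container,
    # --rm removes the container again when the check is done
    "${DOCKER:-docker}" run --rm \
        -v "$repo_dir":/github/workspace \
        -w /github/workspace \
        "$image"
}

# Example, from the root of a repository:
#   run_spellcheck_container "$PWD"
```

Since `-w` sets the working directory at run time, this variant should not need the WORKDIR line in the Dockerfile.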

The good thing about open source is that it is so easily available. The Spellcheck action is available on GitHub under an MIT license, which made it possible to address the issues without fear of repercussions, so let's go over the changes.

  1. The already mentioned change of setting WORKDIR in the Dockerfile made it possible to test locally.
  2. I altered the content of the ENTRYPOINT file: entrypoint.sh. This was not crucial to make it work, but it suits my temper better.

The original:

#!/bin/bash
if [ ! -f ./spellcheck.yaml ]; then
    cp /spellcheck.yaml .
fi

if [ ! -f ./wordlist.txt ]; then
    cp /wordlist.txt .
fi

pyspelling -c spellcheck.yaml

  1. I have a hard time with defaults, so I removed the copying in of the action's own spellcheck.yaml and wordlist.txt. I would much rather have the action crash and burn if preconditions are not met than have it apply some general policy and dictionary.
  2. I would prefer that the user were told to include a spellcheck.yaml, and in it specify a wordlist.txt if needed. The documentation could provide pointers on basic configurations for repositories with different contents, like Markdown, HTML, Python etc.
  3. I would prefer the spellcheck.yaml to be a hidden file, since it is basic configuration and not the primary contents of a repository using the action, so the name .spellcheck.yaml should be chosen. The recommendation should be for the wordlist to follow the same policy, using the name .wordlist.txt in the documentation and in the configuration examples.

My version:

#!/bin/sh -l

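# Start without a configuration file and look for one in the repository
SPELLCHECK_CONFIG_FILE=''

# Accept the .yaml suffix ...
if [ -f ./.spellcheck.yaml ]; then
    SPELLCHECK_CONFIG_FILE='.spellcheck.yaml'
fi

# ... and the .yml suffix, which takes precedence if both files exist
if [ -f ./.spellcheck.yml ]; then
    SPELLCHECK_CONFIG_FILE='.spellcheck.yml'
fi

echo ""
echo "Using pyspelling on repository files outlined in $SPELLCHECK_CONFIG_FILE"
echo "----------------------------------------------------------------"

# Run the spell check with the configuration file found above
pyspelling -c $SPELLCHECK_CONFIG_FILE

# Remember the exit code of pyspelling, so the echo below does not overwrite it
EXITCODE=$?

test $EXITCODE -eq 0 || echo "($EXITCODE) Repository contains spelling errors or spelling check failed, please check diagnostics"

# Hand the result back to the caller; a non-zero value fails the step
exit $EXITCODE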

Going over the issues, I had fallen over another open issue, where somebody was naming their YAML file spellcheck.yml, using yml as the suffix, which is a more widely adopted naming convention, so my take on entrypoint.sh also takes this into consideration.
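
The script above does not spell out what should happen when neither file exists. A minimal sketch of making that precondition explicit could look like the following. The function name `find_spellcheck_config` is made up for this example; the lookup order keeps the precedence of the script above, where .spellcheck.yml wins if both files exist.

```shell
#!/bin/sh
# Hypothetical helper: print the name of the configuration file to use,
# or fail with a message on stderr if none exists.
find_spellcheck_config() {
    dir=${1:-.}
    # .spellcheck.yml is tried first, so it wins when both files exist
    for candidate in .spellcheck.yml .spellcheck.yaml; do
        if [ -f "$dir/$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    echo "No .spellcheck.yml or .spellcheck.yaml found in $dir" >&2
    return 1
}

# Possible use in an entrypoint script:
#   config=$(find_spellcheck_config) || exit 1
#   pyspelling -c "$config"
```

Failing early like this is in line with the preference stated above for crashing when preconditions are not met rather than falling back to defaults.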

I have adopted my local implementation for several of my repositories with more to come. This is the opposite of my first recommendation of using the available component if one exists, but it does demonstrate another power: roll your own, if what is available is not working for you or does not match your use-case.

I can see that I could end up in a situation where I would have to maintain several of these actions, so a common action or Docker container should be the approach. Right now I need to backport the latest changes from the last repository I touched to those that already have a similar action implemented locally.
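
Until a common action exists, a small helper can at least show which repositories have drifted. This is only a sketch: the function name `out_of_sync` and the paths in the usage example are hypothetical, and it assumes every repository keeps its copy of the file at the same relative path.

```shell
#!/bin/sh
# Hypothetical helper: list the repositories whose copy of a file differs
# from the reference copy, or which lack the file altogether.
# Usage: out_of_sync REFERENCE_FILE RELATIVE_PATH REPO_DIR...
out_of_sync() {
    reference=$1
    relative_path=$2
    shift 2
    for repo in "$@"; do
        # cmp -s prints nothing and returns non-zero when the files differ
        # or when the second file cannot be read
        if ! cmp -s "$reference" "$repo/$relative_path" 2>/dev/null; then
            printf '%s\n' "$repo"
        fi
    done
}

# Example with made-up repository paths:
#   out_of_sync ./entrypoint.sh entrypoint.sh ~/develop/repo-one ~/develop/repo-two
```

Each directory printed is a candidate for a backport; an empty result means all copies match the reference.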

I love to experiment and learn. Using, hacking and generalising is a good exercise, but I would prefer to work on the actual contents of my repositories and not the infrastructure.

I am by no means unhappy with the original Spellcheck Action or its author @rojopolis. I learned a lot and I see several points of improvement to my own toolbox.

  • Can I get the repository dictionary to be shared between VS Code and the PySpelling based action, like I can with Markdownlint?
  • Can I put aspell to better use during my writing process?

And finally - I could decide to roll my own action based on the work done by @rojopolis. At the same time, I would prefer to send my proposed changes upstream, since this is more the open source way and it would solve the maintenance burden for me and others - if @rojopolis is unresponsive after a period of time, setting up my own project based on a fork could be the way ahead.

Next steps:

  1. Backport changes to all my actions
  2. Create PR with proposed changes and improved documentation
  3. Make further use of the action, preferably based on the original as a Docker container instead of my own

There are lots of grey areas in this post and lots of uncharted territory. If you can fill in some blanks, please comment on the post. Feedback is most welcome: smarter ways, better solutions, questions and insights.

Take care and watch out for each other.
