Linguist

From COPTR

Identify the breakdown of programming languages used in a GitHub repository, or the anticipated language of an individual file
Homepage: https://github.com/github-linguist/linguist/tree/main
Status: Maintained ✅
Source Code: https://github.com/github-linguist/linguist/tree/main
License: MIT
Cost: Free and Open Source (FOSS)
Language: Ruby
Function: Content Profiling, File Format Identification

Description

Identify the breakdown of programming languages used in a GitHub repository, or the anticipated language of an individual file.

The best-known output of Linguist is the language breakdown shown on a repository's page on GitHub, which reports the percentage of each language used alongside a small chart visualizing that breakdown.
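As a rough illustration of how such a percentage breakdown can be derived, the sketch below turns per-language sizes into shares of the total. It assumes that each language's share is computed from the number of bytes attributed to it; the function name `language_breakdown` and the sample figures are hypothetical and are not part of Linguist itself.

```python
def language_breakdown(bytes_per_language: dict[str, int]) -> dict[str, float]:
    """Return each language's share of the total as a percentage, largest first."""
    total = sum(bytes_per_language.values())
    if total == 0:
        # Nothing to report for an empty (or fully excluded) repository.
        return {}
    ranked = sorted(bytes_per_language.items(), key=lambda item: item[1], reverse=True)
    return {language: round(100 * size / total, 2) for language, size in ranked}


if __name__ == "__main__":
    # Hypothetical byte counts, for illustration only.
    example = {"Ruby": 7500, "C": 2000, "Shell": 500}
    for language, share in language_breakdown(example).items():
        print(f"{language}: {share}%")
```

Sorting by size first means the result can be rendered directly as a chart with the dominant language leading.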

Unlike utilities such as DROID, which is based on PRONOM and relies on pattern matching to identify a file format (in this case, a programming language), GitHub's Linguist is deterministic and uses a series of decision-making strategies to determine the likely programming language, and thus the file format.

Algorithm

Linguist first reduces the number of inputs it has to deal with, for example by excluding binary objects and files identified as data. It then applies a series of identification strategies to determine the programming language used across a GitHub repository or in an individual file.
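The input-reduction step can be pictured with the minimal sketch below. It is an illustration only: the NUL-byte test is a common generic heuristic for spotting binary content and is not claimed to be Linguist's own check, and the `DATA_EXTENSIONS` table and the function names are hypothetical.

```python
import os

# Hypothetical table of extensions treated as data rather than code.
DATA_EXTENSIONS = {".json", ".csv", ".yml", ".yaml"}


def looks_binary(data: bytes, sample_size: int = 8000) -> bool:
    """Generic heuristic (an assumption, not Linguist's method):
    a NUL byte in the leading sample suggests binary content."""
    return b"\0" in data[:sample_size]


def select_inputs(files: dict[str, bytes]) -> list[str]:
    """Return the names of files that remain after dropping binary and data files."""
    kept = []
    for name, data in files.items():
        if looks_binary(data):
            continue  # binary objects are excluded
        if os.path.splitext(name)[1].lower() in DATA_EXTENSIONS:
            continue  # files identified as data are excluded
        kept.append(name)
    return kept
```

Only the files returned by `select_inputs` would then be passed on to language identification.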

Its algorithm is described in more detail on GitHub.

Linguist lists the following strategies, which it applies in order:

  • Vim or Emacs modeline,
  • commonly used filename,
  • shell shebang,
  • file extension,
  • XML header,
  • man page section,
  • heuristics,
  • naïve Bayesian classification
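The way such a chain of checks can narrow down a result is sketched below. This is a simplified illustration under stated assumptions, not Linguist's implementation: it covers only four of the listed checks (filename, shebang, extension, heuristics), every lookup table and function name is hypothetical, and it assumes that a strategy returning exactly one language ends the search while a strategy returning several narrows the candidates handed to the next one.

```python
import os
import re
from typing import Callable, Optional

# All tables below are tiny hypothetical samples, for illustration only.
FILENAMES = {"Makefile": ["Makefile"], "Dockerfile": ["Dockerfile"]}
INTERPRETERS = {"python": ["Python"], "python3": ["Python"],
                "ruby": ["Ruby"], "sh": ["Shell"], "bash": ["Shell"]}
EXTENSIONS = {".py": ["Python"], ".rb": ["Ruby"],
              ".h": ["C", "C++", "Objective-C"],
              ".m": ["Objective-C", "MATLAB"]}

Strategy = Callable[[str, str, list[str]], list[str]]


def by_filename(name: str, content: str, candidates: list[str]) -> list[str]:
    return FILENAMES.get(os.path.basename(name), [])


def by_shebang(name: str, content: str, candidates: list[str]) -> list[str]:
    first_line = content.splitlines()[0] if content else ""
    match = re.match(r"#!\s*(\S+)(?:\s+(\S+))?", first_line)
    if not match:
        return []
    interpreter = os.path.basename(match.group(1))
    if interpreter == "env" and match.group(2):
        interpreter = match.group(2)  # e.g. "#!/usr/bin/env python3"
    return INTERPRETERS.get(interpreter, [])


def by_extension(name: str, content: str, candidates: list[str]) -> list[str]:
    return EXTENSIONS.get(os.path.splitext(name)[1].lower(), [])


def by_heuristics(name: str, content: str, candidates: list[str]) -> list[str]:
    # Toy content rules that only choose among the remaining candidates.
    if "Objective-C" in candidates and re.search(
            r"^\s*@(interface|implementation|end)\b", content, re.M):
        return ["Objective-C"]
    if "MATLAB" in candidates and re.search(r"^\s*%", content, re.M):
        return ["MATLAB"]
    return []


STRATEGIES: list[Strategy] = [by_filename, by_shebang, by_extension, by_heuristics]


def detect(name: str, content: str) -> Optional[str]:
    """Run the strategies in order; return a language once one is unambiguous."""
    candidates: list[str] = []
    for strategy in STRATEGIES:
        found = strategy(name, content, candidates)
        if len(found) == 1:
            return found[0]
        if len(found) > 1:
            candidates = found  # narrowed set for the next strategy
    return None  # still ambiguous or unknown in this sketch
```

In this sketch a file that is still ambiguous after the last check yields `None`; in the list above, that is the point at which the naïve Bayesian classification would come into play.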

User Experiences

Development Activity

All development activity is visible on GitHub: https://github.com/github-linguist/linguist/commits

Release Feed

The 3 most recent releases:

  • 2026-06-08 11:03:25: v9.6.0, by lildude
  • 2026-03-18 15:09:57: v9.5.0, by lildude
  • 2026-01-21 10:42:09: v9.4.0, by lildude

Activity Feed

The 5 most recent commits:

  • 2026-06-18 15:26:52: Change the C# colour to purple (#8026), commit e9fe3c9f230cd9220afcd057f75702de4d7700c9, by violetshine (https://github.com/violetshine)
  • 2026-06-17 16:08:43: Add Blueprint language (#8001), commit 50bf308a08da29f923774ed161530258e4949564, by kaypes (https://github.com/kaypes)
  • 2026-06-17 16:03:07: Test against Ruby 3.4 and 4.0 (#8011), commit 3d6f1a57f10d336b82bbff9b1af29ffc577bf585, by byroot (https://github.com/byroot)
  • 2026-06-17 15:30:38: Speedup gem loading (#8010), commit 1d6a052bbd75741ed1468005317fab97a0539a30, by byroot (https://github.com/byroot)
  • 2026-06-15 08:53:40: Add support for 4 Ignore List/Git Config filenames (#8020), commit 67681a165b0031a99b47f100148f9b147b19fce5, by Alhadis (https://github.com/Alhadis)

References

See also

  • CLOC (Count Lines of Code) on COPTR.