Hugo Technical


Hugo Conf 2022

The link for the Hugo conf is: https://hugoconf.io/

You can also see the slides of my presentation on Lexer, Parser and Goldmark.

Why Hugo?

When I had to choose my static site generator, I watched Brian Rinaldi’s YouTube video How to choose your static site generator and read his article on Stackbit. I dug a little deeper into Hugo with Ryan Schachte’s video Creating a Blog with Hugo and Github in 10 minutes.

In the end, the choice was between Hugo and Jekyll. I went with Hugo because Jekyll is written in Ruby and Hugo in the Go programming language, which is more similar to C and thus easier for me to work with.

Or at least, that’s what I told myself to justify the choice I made. In truth, the Jekyll website just looked messy and old-fashioned compared to Hugo’s minimalistic approach. I simply had a better feeling about Hugo.

Various

Inner structure

  • hugolib/page__content.go:contentToRender()

    • called from hugolib/page__per_output.go:newPageContentOutput
      • has lambda function initContent(), calling contentToRender()
    • has input pm pageContentMap
    • pageContentMap is a list of
      • pageparser.Items
      • pageContentReplacement
      • shortcode - either rendered or placeholders
    • returns bytes (a string) to be processed by Goldmark
  • Markup: [link3conshort]({{< relref "page1" >}})

  • Placeholders for shortcodes are created via hugolib/page__per_output.go:newPageContentOutput() -> initContent (closure) -> p.shortcodeState.renderShortcodesForPage(p, f)

    • fills cp.contentPlaceholders
  • The content, with shortcodes already replaced by the placeholders, is loaded in hugolib/page__per_output.go:newPageContentOutput() -> initContent (closure) -> p.contentToRender()

    • receives cp.contentPlaceholders
    • populates cp.workContent
  • hugolib/page__per_output.go:renderContent

    • receives input (content): [link3conshort](HAHAHUGOSHORTCODE-s0-HBHB)
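The placeholder round trip described above can be sketched in a few lines of Go. This is only an illustration of the idea, not Hugo's code: `placeholderFor` and `replacePlaceholders` are hypothetical helpers, the token format simply mirrors the `HAHAHUGOSHORTCODE-s0-HBHB` string seen in the notes, and `/page1/` is an assumed result for the `relref` shortcode.

```go
package main

import (
	"fmt"
	"strings"
)

// placeholderFor builds the placeholder token for the n-th shortcode on a page.
// Hypothetical helper; the format copies the token observed in renderContent.
func placeholderFor(n int) string {
	return fmt.Sprintf("HAHAHUGOSHORTCODE-s%d-HBHB", n)
}

// replacePlaceholders swaps every placeholder found in content for the
// rendered output of the corresponding shortcode.
func replacePlaceholders(content string, rendered map[string]string) string {
	for placeholder, output := range rendered {
		content = strings.ReplaceAll(content, placeholder, output)
	}
	return content
}

func main() {
	// The content as the Markdown renderer sees it: the shortcode is already
	// replaced by its placeholder.
	content := "[link3conshort](" + placeholderFor(0) + ")"

	// Assumed rendered output of the relref shortcode.
	rendered := map[string]string{placeholderFor(0): "/page1/"}

	fmt.Println(replacePlaceholders(content, rendered))
}
```

The point of the indirection is that the Markdown renderer never sees shortcode syntax, only an opaque token that survives rendering and can be replaced afterwards.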

hugolib/shortcode.go:renderShortcodesForPage

  • input:
    • p *pageState
    • f output.Format
  • output:
    • map[string]string, bool, error
  • s *shortcodeHandler
  • type shortcode (name of shortcode, params, placeholder, pos, length etc. etc.)
  • type shortcodeHandler
    • p *pageState
    • s *Site
    • shortcodes []*shortcode // Ordered list of shortcodes for a page.
  • shortcode delimiters "{{"… are in parser/pageparser/pagelexer_shortcode.go (leftDelimScWithMarkup becomes tLeftDelimScWithMarkup)
    • names starting with "t" are only item types (generic constants)
    • names without the leading "t" are the actual shortcode delimiters
    • in parser/pageparser/item.go are the constants (const string)
    • in pagelexer_shortcode.go are the delimiters (var = []byte)
    • itemtype_string.go converts between var and string
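To make the naming convention concrete, here is a small self-contained sketch of the two kinds of names side by side. It is an assumption-laden illustration, not a copy of Hugo's source: the `ItemType` values are a reduced hypothetical set, the delimiter bytes `{{%` and `%}}` are assumed, `classify` is a made-up helper, and the `String` method is written by hand to stand in for what `stringer` would generate.

```go
package main

import (
	"bytes"
	"fmt"
)

// ItemType is the kind of token the lexer emits (names start with "t").
type ItemType int

const (
	tText ItemType = iota
	tLeftDelimScWithMarkup
	tRightDelimScWithMarkup
)

// Raw delimiters searched for in the input (names without the "t").
// The byte values are assumptions for this sketch.
var (
	leftDelimScWithMarkup  = []byte("{{%")
	rightDelimScWithMarkup = []byte("%}}")
)

// String is handwritten here; in a real project stringer would generate it.
func (t ItemType) String() string {
	switch t {
	case tText:
		return "tText"
	case tLeftDelimScWithMarkup:
		return "tLeftDelimScWithMarkup"
	case tRightDelimScWithMarkup:
		return "tRightDelimScWithMarkup"
	default:
		return fmt.Sprintf("ItemType(%d)", int(t))
	}
}

// classify maps the start of the input to an item type (hypothetical helper).
func classify(input []byte) ItemType {
	switch {
	case bytes.HasPrefix(input, leftDelimScWithMarkup):
		return tLeftDelimScWithMarkup
	case bytes.HasPrefix(input, rightDelimScWithMarkup):
		return tRightDelimScWithMarkup
	default:
		return tText
	}
}

func main() {
	fmt.Println(classify([]byte("{{% note %}}")))
	fmt.Println(classify([]byte("plain text")))
}
```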

Must look at hugolib/shortcode.go, around line 580

  • pt *pageparser.Iterator: what does the method Next() return?
    • returns type Iterator struct { l (lowercase L) *pageLexer, lastPos int }
    • pageLexer is in parser/pageparser/pagelexer.go
      • type pageLexer { input []byte, stateStart, state, pos }
  • s.shortcodes: what are they? (used in a range loop)
  • calls (line 525)
    • renderShortcode(0, s.s, tplVariants, v, nil, p)
    • v is one of s.shortcodes

who calls whom

hugolib/content__map_page.go:newPageFromContentNode

  • up to line 190
  • parseResult = pageparser.Parse(r) (around line 166)
  • so pageparser.Parse splits the content into chunks

parser/pageparser/pageparser.go:Parse

  • called by hugolib/content__map_page.go:newPageFromContentNode
  • divides the text in Items
  • immediately calls parseSection, which reads the input into memory and calls parseBytes
  • parseBytes calls lexer := newPageLexer(b, start, cfg) and lexer.run()
  • lexer.run() calls l.state(), see line 101
    • lexer.state is a closure (stateFunc)
    • lexer.stateStart is set to lexIntroSection or lexMainSection; lexIntroSection is in parser/pageparser/pagelexer_intro.go and lexMainSection is at parser/pageparser/pagelexer.go:500
    • ==We must focus on lexMainSection - sectionHandlers.skip==
    • in pagelexer.go:365 isShortCodeStart()

==parser/pageparser/pageparser.go:parseBytes== is not called anymore; it starts newPageLexer and lexer.run with lexIntroSection

  • lexIntroSection returns lexMainSection
  • newPageLexer (pageparser/pagelexer.go) returns a *pageLexer initialized with sectionHandlers (createSectionHandler)
  • pageLexer.run() (pagelexer.go:93)
    • for l.state = ==l.stateStart==; l.state != nil; { l.state = l.state(l) }
    • calls lexIntroSection (stateStart) and then lexMainSection
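The run loop above is the classic state-function lexer pattern: each state is a function that consumes some input and returns the next state, and the loop stops when a state returns nil. A minimal sketch of that pattern follows. The `lexer` struct, `lexIntro`, and `lexMain` are simplified stand-ins of my own, not Hugo's types, and the front matter handling (a block between two `---` lines) is an assumption made just to give the intro state something to do.

```go
package main

import (
	"fmt"
	"strings"
)

// stateFunc consumes part of the input and returns the next state,
// or nil when lexing is finished.
type stateFunc func(*lexer) stateFunc

// lexer is a reduced stand-in for pageLexer.
type lexer struct {
	input      string
	pos        int
	stateStart stateFunc
	state      stateFunc
	items      []string
}

// run drives the state machine, same shape as the loop quoted above.
func (l *lexer) run() {
	for l.state = l.stateStart; l.state != nil; {
		l.state = l.state(l)
	}
}

// lexIntro consumes an optional front matter block, then hands over to lexMain.
func lexIntro(l *lexer) stateFunc {
	const delim = "---\n"
	if strings.HasPrefix(l.input, delim) {
		if end := strings.Index(l.input[len(delim):], delim); end >= 0 {
			l.pos = len(delim) + end + len(delim)
			l.items = append(l.items, "frontmatter")
		}
	}
	return lexMain
}

// lexMain emits the remaining input as one text item and ends the run.
func lexMain(l *lexer) stateFunc {
	if l.pos < len(l.input) {
		l.items = append(l.items, "text:"+l.input[l.pos:])
		l.pos = len(l.input)
	}
	return nil
}

func main() {
	l := &lexer{input: "---\ntitle: x\n---\nHello", stateStart: lexIntro}
	l.run()
	fmt.Println(l.items)
}
```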

==shortCodeHandler== parser/pageparser/pagelexer.go:315

  • skipFunc (l.index(leftDelimSc))
  • lexFunc ()
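As I read it, a section handler pairs two functions: skipFunc says how far the lexer can jump ahead before anything interesting starts, and lexFunc consumes the interesting part. The sketch below shows that split under my own assumptions: the `handler` struct, its signatures, and `split` are hypothetical, `{{` and `}}` are assumed delimiters, and an unterminated shortcode is simply consumed up to the end of the input.

```go
package main

import (
	"bytes"
	"fmt"
)

var (
	leftDelimSc  = []byte("{{")
	rightDelimSc = []byte("}}")
)

// handler is a hypothetical, reduced version of a section handler.
type handler struct {
	// skipFunc returns the index of the next delimiter, or -1 if none.
	skipFunc func(input []byte) int
	// lexFunc returns how many bytes the section starting at input[0] spans.
	lexFunc func(input []byte) int
}

var shortcodeHandler = handler{
	skipFunc: func(input []byte) int { return bytes.Index(input, leftDelimSc) },
	lexFunc: func(input []byte) int {
		if end := bytes.Index(input, rightDelimSc); end >= 0 {
			return end + len(rightDelimSc)
		}
		return len(input) // unterminated: consume up to EOF
	},
}

// split cuts the input into text chunks and shortcode chunks.
func split(input []byte, h handler) []string {
	var out []string
	for len(input) > 0 {
		i := h.skipFunc(input)
		if i < 0 {
			out = append(out, "text:"+string(input))
			break
		}
		if i > 0 {
			out = append(out, "text:"+string(input[:i]))
		}
		n := h.lexFunc(input[i:])
		out = append(out, "shortcode:"+string(input[i:i+n]))
		input = input[i+n:]
	}
	return out
}

func main() {
	for _, chunk := range split([]byte(`Hi {{< relref "page1" >}} bye`), shortcodeHandler) {
		fmt.Println(chunk)
	}
}
```

Skipping with a plain byte search keeps the common case cheap: long runs of ordinary text are never examined character by character.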

hugolib/page.go:mapContent

  • p.source.parsed: has all the text and content of the md file
  • but the code is already in slices
  • called by hugolib/content__map_page.go:newPageFromContentNode (93)

hugolib/page.go:mapContentForResult

  • digests all the chunks contained in mapContent
  • once it finds a delimiter (already pre-digested), parses it and closes it without changing mapContent
  • called by hugolib/page__per_output.go:RenderString(368)
  • or from hugolib/page.go:mapContent

What I understood of the internal structure

//go:generate stringer -type=ItemType

  • The parser calls the lexer
  • the lexer splits the page into chunks, among them front matter, shortcode, summaryDivider, populating the Items in pageparser.Result.Iterator()
  • then mapContentForResult goes through the Items and sorts them by it.IsFrontMatter, etc.
  • So the first step is:
    • create tLeftDelimInternalLink and tRightDelimInternalLink
    • change createSectionHandler so that it includes internalLinkHandler (insert items with an emit)
      • create pagelexer_internallink.go
      • but does pagelexer not check whether links are closed correctly? I believe not, because it only checks for EOF