
I use the same script as Dibby053, copied from Stack Overflow, but with some tweaks so it works on KDE and GNOME, on Wayland as well as X11, and sends notifications about what state it is in.

I haven't tested the X11/Wayland check yet, but feel free to use it and report back.

  #!/bin/bash 
  # Dependencies: tesseract-ocr imagemagick 
  # on gnome: gnome-screenshot 
  # on kde: spectacle
  # on x11: xsel
  # on wayland: wl-clipboard

  die(){
  notify-send "$1"
  exit 1
  }
  cleanup(){
  [[ -n $1 ]] &&  rm -rf "$1"
  }

  SCR_IMG=$(mktemp)  || die "failed to create temp file"

  # shellcheck disable=SC2064
  trap "cleanup '$SCR_IMG'" EXIT

  notify-send "Select the area of the text" 
  if  which "spectacle" &> /dev/null
  then
    spectacle -r -o "$SCR_IMG.png" || die "failed to take screenshot"
  else
    gnome-screenshot -a -f "$SCR_IMG.png" || die "failed to take screenshot"
  fi

  # desaturate (-modulate 100,0) and upscale to 400% before OCR
  mogrify -modulate 100,0 -resize 400% "$SCR_IMG.png"  || die "failed to convert image"
  #should increase detection rate

  tesseract "$SCR_IMG.png" "$SCR_IMG" &> /dev/null || die "failed to extract text"
  if [ "$XDG_SESSION_TYPE" == "wayland" ]
  then 
  wl-copy < "$SCR_IMG.txt" || die "failed to copy text to clipboard"
  else
  xsel -b -i  < "$SCR_IMG.txt" || die "failed to copy text to clipboard"
  fi
  notify-send "Text extracted"
  exit

Edit: formatting.



I slightly modified your script to (1) clean up properly and (2) run spectacle in background mode, so the window does not pop up after taking the screenshot.

  #!/bin/bash 
  # Dependencies: tesseract-ocr imagemagick 
  # on gnome: gnome-screenshot 
  # on kde: spectacle
  # on x11: xsel
  # on wayland: wl-clipboard
  
  die(){
    notify-send "$1"
    exit 1
  }
  cleanup(){
    [[ -n $1 ]] && rm -r "$1"
  }
  
  SCR_IMG=$(mktemp -d) || die "failed to create temp dir"
  
  # shellcheck disable=SC2064
  trap "cleanup '$SCR_IMG'" EXIT
  
  #notify-send "Select the area of the text" 
  if  which "spectacle" &> /dev/null
  then
    spectacle -b -r -o "$SCR_IMG/scr.png" || die "failed to take screenshot"
  else
    gnome-screenshot -a -f "$SCR_IMG/scr.png" || die "failed to take screenshot"
  fi
  
  # desaturate (-modulate 100,0) and upscale to 400% before OCR
  mogrify -modulate 100,0 -resize 400% "$SCR_IMG/scr.png"  || die "failed to convert image"
  #should increase detection rate
  
  tesseract "$SCR_IMG/scr.png" "$SCR_IMG/scr" &> /dev/null || die "failed to extract text"
  if [ "$XDG_SESSION_TYPE" == "wayland" ]
  then 
    wl-copy < "$SCR_IMG/scr.txt" || die "failed to copy text to clipboard"
  else
    xsel -b -i  < "$SCR_IMG/scr.txt" || die "failed to copy text to clipboard"
  fi
  notify-send "Text extracted"
  exit


This is great!

I also made some minor modifications: replaced `xsel` with `xclip` and added a truncated version of the copied text to the `notify-send` call:

  #!/bin/bash 
  # Dependencies: tesseract-ocr imagemagick 
  # on gnome: gnome-screenshot 
  # on kde: spectacle
  # on x11: xclip
  # on wayland: wl-clipboard

  die(){
    notify-send "$1"
    exit 1
  }
  cleanup(){
    [[ -n $1 ]] && rm -r "$1"
  }

  SCR_IMG=$(mktemp -d) || die "failed to create temp dir"

  # shellcheck disable=SC2064
  trap "cleanup '$SCR_IMG'" EXIT

  #notify-send "Select the area of the text" 
  if  which "spectacle" &> /dev/null
  then
    spectacle -n -b -r -o "$SCR_IMG/scr.png" || die "failed to take screenshot"
  else
    gnome-screenshot -a -f "$SCR_IMG/scr.png" || die "failed to take screenshot"
  fi

  # desaturate (-modulate 100,0) and upscale to 400% before OCR
  mogrify -modulate 100,0 -resize 400% "$SCR_IMG/scr.png"  || die "failed to convert image"
  #should increase detection rate

  tesseract "$SCR_IMG/scr.png" "$SCR_IMG/scr" &> /dev/null || die "failed to extract text"
  if [ "$XDG_SESSION_TYPE" == "wayland" ]
  then 
    wl-copy < "$SCR_IMG/scr.txt" || die "failed to copy text to clipboard"
  else
    # xsel -b -i  < "$SCR_IMG/scr.txt" || die "failed to copy text to clipboard"
    xclip -selection clipboard -i < "$SCR_IMG/scr.txt" || die "failed to copy text to clipboard"  
  fi
  # Notify the user what was copied but truncate the text to 100 characters
  notify-send "Text extracted from image" "$(head -c 100 "$SCR_IMG/scr.txt")" || die "failed to send notification"
  exit


I just frankenstein'd a few people's versions into my own MATE-based flavor.

For anyone running into barriers, mate-screenshot has no outfile `-f` option, so I worked around that by outputting through the clipboard and capturing that with `xclip` (note: this is earlier in the script than the xsel/xclip line in the parent and grandparent comments):

  mate-screenshot -a -c && xclip -selection clipboard -t image/png -o > "$SCR_IMG/scr.png" || die "failed to take screenshot"

The other hiccup is that the dumped text file ends with two extraneous bytes, '\x0a\x0c', so I truncated them with `head`:

  (xclip -selection clipboard -i < <(head -c -2 "$SCR_IMG/scr.txt")) || die "failed to copy text to clipboard"

Might not be pretty, but it looks like this will work for me. Thank you all for this!
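
For anyone whose `head` doesn't accept a negative byte count (that form is a GNU extension), here is a rough alternative sketch. It assumes the OCR output ends in a newline plus a form feed, as described above; `strip_ocr_tail` is a made-up helper name, not part of any of the scripts in this thread.

```shell
#!/usr/bin/env bash
# strip_ocr_tail FILE (hypothetical helper): print FILE without form feeds
# and without trailing newlines.
# Note: tr removes every form feed in the file, not only the last one.
strip_ocr_tail() {
  local text
  text=$(tr -d '\f' < "$1") || return 1   # $(...) also drops trailing newlines
  printf '%s' "$text"
}

# Demo on a stand-in for the OCR output file.
sample=$(mktemp)
printf 'hello world\n\f' > "$sample"
strip_ocr_tail "$sample" | wc -c   # 11 bytes: the text without the trailing "\n\f"
rm -f "$sample"
```

In the script this would probably be used as `strip_ocr_tail "$SCR_IMG/scr.txt" | xclip -selection clipboard -i`.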


Good catch with spectacle, I thought I fixed that already.

Why did you remove the -f parameter?


I like all the error handling, but you could skip the temp files if you just pipe everything through:

    #!/usr/bin/env bash
    langs=(eng ara fas chi_sim chi_tra deu ell fin heb hun jpn kor nld rus tur)
    lang=$(printf '%s\n' "${langs[@]}" | fuzzel -d "$@")
    grim -g "$(slurp)" - | mogrify -modulate 100,0 -resize 400% png:- | tesseract -l eng+${lang} - - | wl-copy
    notify-send "Text extracted"


If you just put `set -o errexit -o pipefail -o nounset` on the first line after the shebang, your script will have proper error handling as well. Currently, if any command fails, notify-send will still be triggered.
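
A small sketch of the `pipefail` part, since that is what makes a failure in an early pipeline stage visible at all (with `errexit` also set, the script would then stop on that status). `pipeline_status` is a made-up helper, and `false`/`true` stand in for the screenshot command.

```shell
#!/usr/bin/env bash
# pipeline_status on|off CMD... (hypothetical helper): run "CMD | cat" in a
# subshell with pipefail switched on or off and print the pipeline's exit status.
pipeline_status() {
  local pipefail=$1
  shift
  (
    if [[ $pipefail == on ]]; then
      set -o pipefail
    else
      set +o pipefail
    fi
    "$@" 2>/dev/null | cat >/dev/null
    echo "$?"
  )
}

pipeline_status off false   # prints 0: only the last stage (cat) counts
pipeline_status on false    # prints 1: the failing first stage is reported
pipeline_status on true     # prints 0
```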


This version looks nice and short. Any thoughts on proper error reporting to the end user?

My version has more feedback for the user which was important because the user was somebody not familiar with linux/bash, but even my version "swallows" errors.


I added the `set -o errexit -o pipefail -o nounset` line suggested below, but I think mogrify only fails if the screenshot fails. Tesseract never fails if there is a valid input image, so realistically you only need one error message for the screenshot generation, unless you want to check whether the user is missing any of the tools.
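
If you did want the tool check, a minimal sketch could look like this. `missing_tools` is a made-up helper; the tool list is just the commands from the pipeline version above.

```shell
#!/usr/bin/env bash
# missing_tools TOOL... (hypothetical helper): print the name of every tool
# that cannot be found on PATH, one per line.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

missing=$(missing_tools grim slurp mogrify tesseract wl-copy notify-send)
if [[ -n $missing ]]; then
  # A real script would probably notify the user and exit here.
  echo "missing tools: ${missing//$'\n'/, }" >&2
fi
```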


I also used the very same script until I stumbled upon this on HN [0].

    #!/usr/bin/env bash
    langs=(eng ara fas chi_sim chi_tra deu ell fin heb hun jpn kor nld rus tur)
    lang=$(printf '%s\n' "${langs[@]}" | dmenu "$@")
    maim -us | tesseract --dpi 145 -l eng+${lang} - - | xsel -bi

[0]: https://news.ycombinator.com/item?id=33704483#33705272


Ah just saw rjzzleep posted an updated version here. Happy to steal this one again :)


Looks nice


    # shellcheck disable=SC2064
    trap "cleanup '$SCR_IMG'" EXIT

While shellcheck can have false positives, and SCR_IMG probably doesn't contain any characters that need escaping, the warning isn't exactly wrong in this case.

The command string passed to `trap` is evaluated as normal shell code when the trap fires, so variable expansions do take place at that point.

    trap 'cleanup "$SCR_IMG"' EXIT

This will behave correctly, and the expansion of SCR_IMG won't be susceptible to issues relating to unquoted shell characters.

Alternatively, if you're using a modern bash (4.4 or later; this probably won't work on a Mac by default), then this is an option too:

    trap "cleanup ${SCR_IMG@Q}" EXIT
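
To see the difference, here is a rough sketch that tries each quoting style on a path containing a single quote. `try_trap` and the directory name are made up for the demo; `cleanup` mirrors the one in the script.

```shell
#!/usr/bin/env bash
# try_trap STYLE (hypothetical helper): create a scratch directory whose name
# contains a single quote, install an EXIT trap in a subshell using the given
# quoting style, then report whether the directory was cleaned up.
try_trap() {
  local style=$1 base dir
  base=$(mktemp -d) || return 1
  dir="$base/it's here"
  mkdir "$dir" || return 1
  (
    cleanup() { [[ -n $1 ]] && rm -r "$1"; }
    SCR_IMG=$dir
    case $style in
      early)  trap "cleanup '$SCR_IMG'" EXIT ;;     # expanded when the trap is set
      late)   trap 'cleanup "$SCR_IMG"' EXIT ;;     # expanded when the trap runs
      quoted) trap "cleanup ${SCR_IMG@Q}" EXIT ;;   # needs bash 4.4 or later
    esac
  ) 2>/dev/null   # hide the syntax error the "early" style produces
  if [[ -e $dir ]]; then echo "left behind"; else echo "removed"; fi
  rm -rf "$base"
}

try_trap early    # left behind: the embedded quote breaks the trap command
try_trap late     # removed
try_trap quoted   # removed on bash 4.4 or later
```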


thanks for fixing and explaining that, I thought '' would work and forgot about escaping characters.


Binding a hotkey to `bash -c 'flameshot gui -s -r | tesseract - - | gxmessage -title "Decoded Data" -fn "Consolas 12" -wrap -geometry 640x480 -file -'` does the job for me.

I just press the hotkey (Super+O), drag the selection over whatever I want to OCR, then immediately get a popup dialog containing the captured text.


The Wayland leg works fine for me on gnome+wayland.


thanks!



