## Usage (draft)

- Clone the repository, navigate into it, and install all required packages with `pip install -r requirements.txt` in a new Python environment (the OCR packages are very finicky).
- If using external APIs, you will need to obtain API keys for the currently supported sites (Google for the Gemini models, Groq for an assortment of open-source LLMs) and add the associated keys to the environment variables file. If using another API, define a new class with a `_request` function in `helpers/batching.py`, inheriting from the `ApiModels` class; a template is provided in that file under the `Gemini` and `Groq` classes, and all exception handling is already taken care of. A sketch of a new provider class is shown after this list.
- Edit the `api_models.json` file to add the models you want. The first level of the JSON file is the corresponding class name defined in `helpers/batching.py`; the second level defines the `model` names from their corresponding API endpoints; the third level specifies the rates of each model, where `rpmin`, `rph`, `rpd`, `rpw`, `rpmth`, `rpy` are respectively the rates per minute, hour, day, week, month and year. An illustrative layout is shown after this list.
- Edit the `.env` config file. For information about all the variables to edit, check the section under "EDIT THESE ENVIRONMENTAL VARIABLES". If CUDA is not detected, the app defaults to CPU mode for all local LLMs and OCRs; in that case it is recommended to set the `OCR_MODEL` variable to `rapid`, which is optimised for CPUs. Currently this is only supported with `SOURCE_LANG` set to `ch_tra`, `ch_sim` or `en`. Refer to the first point under [Notes and optimisations](#notes-and-optimisations) below. A minimal CPU-only `.env` example is shown after this list.
- If you are using the `wayland` display protocol (Linux only -- check with `echo $WAYLAND_DISPLAY`), install the `grim` package with your package manager. Screenshotting is limited on Wayland, and `grim` is one of the more lightweight options available.
  - Arch-based: `sudo pacman -S grim`
  - Debian-based: `sudo apt install grim`
  - Fedora: `sudo dnf install grim`
  - openSUSE: `sudo zypper install grim`
  - NixOS: `nix-shell -p grim`
- The RapidOCR, PaddleOCR and (possibly -- I can't remember) easyOCR models need to be downloaded before any of this can be used. They should download automatically when you execute a function that initialises the model with the desired OCR language, but this appears not to work well when running the app directly (I'll add more to this later). The same obviously holds for the local translation and LLM models. A snippet to trigger the OCR model downloads is shown after this list.
- Run the `main.py` file and a Qt6 app should appear. Alternatively, if that doesn't work, go to the last line of `main.py` and change the argument to `web`, which will run the translations locally on `0.0.0.0:5000` or on any other port you specify.
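To make the custom-API step above more concrete, here is a minimal sketch of what an extra provider class might look like. It assumes subclasses of `ApiModels` only need to supply `_request`; the class name, `_request` signature, endpoint, environment variable and response field are all placeholders, so follow the actual template under the `Gemini` and `Groq` classes in `helpers/batching.py`.

```python
# Sketch of an extra provider class to add inside helpers/batching.py,
# next to the existing Gemini and Groq classes. Endpoint, payload shape,
# response field and the _request signature are assumptions, not the real template.
import os

import requests


class MyProvider(ApiModels):  # ApiModels is already defined in helpers/batching.py
    """Hypothetical third-party API wrapper."""

    def _request(self, model, prompt):
        # Per the README, exception handling is already taken care of elsewhere,
        # so this only needs to perform the raw HTTP call.
        resp = requests.post(
            "https://api.example.com/v1/chat",  # placeholder endpoint
            headers={"Authorization": f"Bearer {os.environ['MYPROVIDER_API_KEY']}"},
            json={"model": model, "prompt": prompt},
            timeout=30,
        )
        resp.raise_for_status()
        return resp.json()["text"]  # placeholder response field
```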
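An illustrative `api_models.json` layout following the three levels described above. The top-level keys mirror the class names in `helpers/batching.py`; the model names and rate numbers below are placeholders, and only some of the rate keys are shown.

```json
{
  "Gemini": {
    "gemini-model-name": { "rpmin": 15, "rpd": 1500 }
  },
  "Groq": {
    "groq-model-name": { "rpmin": 30, "rph": 500, "rpd": 14400 }
  }
}
```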
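A minimal `.env` fragment for the CPU-only case described above. Only the two variables named in this README are shown; the API keys and other variables from the "EDIT THESE ENVIRONMENTAL VARIABLES" section go in the same file.

```
# CPU-friendly defaults (variable names taken from this README)
OCR_MODEL=rapid
SOURCE_LANG=en
```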
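To pre-download the OCR models outside the app, initialising each engine once from a Python shell is usually enough. This assumes the usual PyPI packages (`easyocr`, `paddleocr`, `rapidocr_onnxruntime`) from `requirements.txt` and uses English as the example language; the local translation and LLM models are not covered here.

```python
# Run once in the same Python environment to trigger the model downloads.
import easyocr                       # downloads detection/recognition weights on first use
from paddleocr import PaddleOCR      # downloads PaddleOCR models on first init
from rapidocr_onnxruntime import RapidOCR

easyocr.Reader(["en"])               # use your SOURCE_LANG here
PaddleOCR(lang="en")
RapidOCR()                           # RapidOCR ships its ONNX models with the wheel
```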
## Notes and optimisations
- Accuracy is limited with RapidOCR, especially if there is a high dynamic range in the graphics.
- Consider lowering the quality of the screen capture for faster OCR processing and a shorter capture time. OCR accuracy and the subsequent translations can be affected, but the entire translation process should then take under 2 seconds without too much sacrifice in OCR quality. Edit the `printsc` functions in `helpers/utils.py` (I will work on setting a config for this); a downscaling sketch is shown after this list.
- Not much of the database aspect has been worked on at the moment. Right now all texts and translations are stored as unicode/ASCII in the `database/translations.db` file. Use it however you want; it is stored locally, only for you. A small snippet for inspecting it is shown after this list.
- Downloading all the models may take up a few GBs of space.
- About 3.5GB of VRAM is used by easyOCR; up to 1.5GB of VRAM for PaddleOCR and RapidOCR.
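For the screen-capture note above, a rough illustration of the kind of downscaling the `printsc` functions could apply before OCR. The real implementation in `helpers/utils.py` is not shown in this README, so treat this as a Pillow-based sketch only.

```python
# Hypothetical downscaling step before OCR (the real code lives in the
# printsc functions in helpers/utils.py; this is only an illustration).
from PIL import Image

def downscale(path: str, factor: float = 0.5) -> Image.Image:
    img = Image.open(path)
    new_size = (int(img.width * factor), int(img.height * factor))
    # BILINEAR is fast and usually keeps text legible enough for OCR.
    return img.resize(new_size, Image.BILINEAR)
```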
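The translation database can be inspected with Python's built-in `sqlite3`. The table layout is not documented here, so this just lists whatever tables exist.

```python
import sqlite3

con = sqlite3.connect("database/translations.db")
tables = con.execute(
    "SELECT name FROM sqlite_master WHERE type='table'"
).fetchall()
print(tables)  # then inspect further with SELECT * FROM <table> LIMIT 5
con.close()
```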
## Debugging Issues
- CUDNN version mismatch when using PaddleOCR: check that `LD_LIBRARY_PATH` is correctly set to the directory containing the `cudnn.so` file. If using a local installation, it may help to simply remove `nvidia-cudnn-cu12` from your Python environment; see the commands after this list.
- Segmentation fault when using PaddleOCR, EasyOCR or RapidOCR: ensure the only `cv2` library installed is the `opencv-contrib-python` package. Check out https://pypi.org/project/opencv-python-headless/ for more info, and see the commands after this list.
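For the cuDNN mismatch above, something along these lines usually helps; the library path is only an example and will differ per system.

```bash
# Point LD_LIBRARY_PATH at the directory that actually contains libcudnn.so
# (the path below is an example).
export LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH

# Or remove the pip-installed cuDNN so it cannot clash with the system copy.
pip uninstall nvidia-cudnn-cu12
```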
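For the segmentation fault above, checking for and removing the conflicting OpenCV wheels follows directly from the note:

```bash
pip list | grep opencv                                  # should only show opencv-contrib-python
pip uninstall -y opencv-python opencv-python-headless
pip install opencv-contrib-python
```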
## TODO
- Create an overlay window that works in Wayland.
- Make use of the translation data -> maybe make a personalised game that uses it.
## Terms of Use
By using this application, you agree to the following terms and conditions.
### Data Collection and External API Use
1.1 Onscreen Data Transmission: The application is designed to send data displayed on your screen, including potentially sensitive or personal information, to an external API if local processing is not set up.
1.2 Third-Party API Integration: When local methods cannot fulfill certain functions, the App will transmit data to external third-party APIs. These APIs are not under our control, and we do not guarantee the security, confidentiality, or purpose of use of the data once transmitted.
### Acknowledgment
By using the app, you acknowledge that you have read, understood, and agree to these Terms of Use, including the potential risks associated with transmitting data to external APIs.