Smart Wristwatch for Visually Impaired
Year: From 2018
Position: Inventor, Designer
Problem & Purpose
Visually impaired people have difficulty understanding their surroundings, and existing assistive systems are often expensive or of limited practical use. The purpose of this project is to develop a wristwatch-sized, low-cost device that detects objects and text with its camera and reads them to the user through headphones, making daily life easier for visually impaired people.
YouTube video of VISION – Bilgegöz
Open Source Code and 3D Model: CLICK HERE TO SEE
- Being the first system that can do object detection and text recognition at the same time.
- Being the first device that detects the distance and direction of objects in addition to identifying them.
- Being the first wristwatch-shaped device that helps visually impaired people with services such as object detection and text reading.
- Supporting hand coordination thanks to its wristwatch form.
- Not requiring any training or familiarization, thanks to its ability to speak.
- Having software that uses machine learning techniques to give visually impaired users a much richer flow of information.
- Doing object detection with a model built on TensorFlow, Google's machine learning platform, and thus reaching accuracy rates of up to 97%.
- Reading text with high accuracy in many different conditions using the Tesseract OCR engine, whose development has been sponsored by Google.
- Functioning in four languages: Turkish, English, French, and German. More languages can be added.
- Being the first blind assistive system under 100 TL (about $17 as of June 2019).
- Being made with easy production methods such as 3D printing.
- Contributing to recycling by reusing an old cell phone battery.
- Being accessible to everybody, including people with low socioeconomic status, thanks to its low cost and easy production.
- Being a cheap yet capable system, thanks to combining high-level software with low-cost electronic components.
General Mechanical Structure
The wristwatch form factor was chosen in order to:
- Support hand coordination
- Make the device ergonomic and easy to use
- Keep it portable so that it does not become a burden to the user
- Keep the user from feeling conspicuous, since it looks like an ordinary wristwatch
Bilgegöz measures 80 × 76 × 31 mm and has a 3D-printed body. The body is printed in PLA, which makes it very durable and poses no health hazard.
The program used for the 3D model: Tinkercad
Some of the design features:
- Top lid can be opened
- Watch band made from a bag strap
- Honeycomb-patterned lid for cooling
- Camera tilted upward by 7°
Ports of the device:
- Headphone jack (3.5mm)
- Charging port (MicroUSB)
- MicroSD card slot
- Debugging ports
The user controls the device with two separate switches:
- The power switch, located on the back of the device.
- The toggle switch on the right side of the Bilgegöz, which switches between the two operating modes: OCR and object detection.
The electrical connections of the Bilgegöz are made by soldering the components to each other with thin, semi-flexible, colored wires. The battery is laid flat at the bottom of the case, leaving more space for the other components. The wiring is also routed to keep every cable as short as possible.
DC-DC Step-Up Converter
Raspberry Pi Zero Computer
For object detection and text recognition, Bilgegöz uses a “USB snake camera” whose housing was removed so that only the internal circuit board remained. This camera was used because its circuit board is very small. It is connected to the Raspberry Pi by four thin wires soldered to the micro USB port of the Zero.
The reasons why this camera was chosen:
- Reasonable price (25 TL)
- Sufficient resolution (3.2 MP)
- Light weight and compact size (39.8 × 12.5 × 9 mm, 3 g)
- An easy-to-use USB interface, so only four wires need to be soldered to the computer
- Adjustable focus
- An angle of view that is not too wide (78 degrees)
Camera of Vision
There is no headphone jack on the Raspberry Pi Zero. Headphones are connected to the Bilgegöz/Vision through a sound card designed for this project. Any kind of speaker or headphone can be plugged into the 3.5 mm jack on this sound card as an output device.
How does it work?
- Analog audio signals are approximated with PWM on GPIO pins 13 and 18 of the Raspberry Pi,
- The output is a digital, stereo signal (one PWM channel per pin),
- This output is made suitable for the headset by a low-pass filter built for this project.
A low-pass filter attenuates unwanted high-frequency components and lets only the required low-frequency signals pass.
The following components were used to build this filter:
- 2 × 270 Ω and 2 × 150 Ω resistors,
- 2 × 10 µF and 2 × 0.33 µF capacitors.
Thus, the 3.3 V PWM signals coming out of the Raspberry Pi's GPIO pins are adapted to the audio interface.
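As a rough check on the filter described above, the cutoff frequency of a first-order RC low-pass stage is f_c = 1 / (2πRC). The sketch below only evaluates that formula. The write-up does not give the circuit topology, so the pairing of the 270 Ω resistor with the 0.33 µF capacitor is an assumption made purely for illustration, and `rc_cutoff_hz` is a hypothetical helper name.

```python
import math


def rc_cutoff_hz(resistance_ohm: float, capacitance_farad: float) -> float:
    """Cutoff frequency of a first-order RC low-pass filter: 1 / (2*pi*R*C)."""
    return 1.0 / (2.0 * math.pi * resistance_ohm * capacitance_farad)


# Assumed pairing (not stated in the write-up): 270 ohm in series, 0.33 uF to ground.
cutoff = rc_cutoff_hz(270.0, 0.33e-6)
print(f"Estimated cutoff frequency: {cutoff:.0f} Hz")
```

If the real circuit pairs the parts differently, the same function can be called with those values instead.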
Software development for this project began with the goal of letting the user learn about the objects in front of them and read text at the same time with Vision. This would have required running the object detection and text recognition software simultaneously.
However, this requires more processing power than the computer used in this project, the Raspberry Pi Zero, can provide.
Instead, a system was built that switches between the two programs. Although the modes do not run simultaneously, the user can select the required one at any time, so both tasks are covered.
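The mode switching described above can be sketched as a small dispatcher that runs exactly one of the two tasks per cycle. This is a minimal illustration, not the project's code: `run_selected_mode`, the handler names, and the mapping of switch position to mode are all assumptions, and reading the physical toggle switch is left out (its state is passed in as a boolean).

```python
from typing import Callable, Dict


def run_selected_mode(switch_on: bool, handlers: Dict[str, Callable[[], str]]) -> str:
    """Run exactly one mode per cycle, chosen by the toggle switch state."""
    # Assumption: switch "on" selects OCR, "off" selects object detection.
    mode = "ocr" if switch_on else "object_detection"
    return handlers[mode]()


# Placeholder handlers; real ones would grab a camera frame and process it.
handlers = {
    "ocr": lambda: "ocr result",
    "object_detection": lambda: "detection result",
}

print(run_selected_mode(False, handlers))
```

Because only one handler runs per cycle, the processor never has to hold both workloads at once, which is the point of the switch-based design.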
The machine learning models (such as artificial neural networks) and the two software packages (Tesseract OCR and TensorFlow) were chosen because they suited the purpose of the project.
Object detection on Vision uses a model built with TensorFlow's Object Detection API (ssdlite_mobilenet_v2_coco_2018_05_09, to be exact). A custom data transfer program, “BLINDSOFT_TRANSFUSION.py,” was developed for this project to work with the model. It determines the identity, direction, and distance of any object in the camera frame and reports them to the user. Vision can currently identify 90 objects, and this number can easily be extended.
- Protobuf package
- Object Detection API
- Edje Electronics' model
Raspberry Pi Zero Compatibility Work:
The versions of OpenCV, TensorFlow, and Protobuf were replaced with builds that are compatible with the ARMv6 processor on the Zero model.
Raspberry Pi Model 3: 0.8 FPS
Raspberry Pi Zero: 0.07 FPS
- Performs the object detection.
- Is compatible with both the PiCamera and a webcam.
- Labels each object with its class name.
- Determines the location of each object as start and end coordinates (e.g., Xmin, Xmax, Ymin, Ymax).
- Calculates these values for every object in the frame in a for loop and passes them to “BLINDSOFT_TRANSFUSION.py” through the function “classtransfer()”; both the program and the function were developed for this project.
Meanings of the variables shown in brackets:
- 1: classes, the name of the identified object
- 2: boxes, the Xmax, Xmin, Ymax, and Ymin values
- 3: IM_WIDTH, IM_HEIGHT, the size of the window (640×480)
- 4: lang, the language in which the object should be announced
With the classtransfer() function, values such as the object's label, the frame width and height, and Xmin, Xmax, Ymin, and Ymax are passed to “BLINDSOFT_TRANSFUSION.py.” That program uses the received values to calculate the distance to the object and makes it easier to deliver all of the data to the user through text to speech. Trigonometric ratios and the Pythagorean theorem are used in the distance calculation.
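The TensorFlow Object Detection API reports each box as normalized [ymin, xmin, ymax, xmax] values between 0 and 1, so a conversion to pixel coordinates is needed before values like Xmin and Xmax can be used. The sketch below shows one way to do that for the 640×480 window. The helper name `box_to_pixels` is hypothetical, and the actual signature of classtransfer() is not shown in this write-up.

```python
from typing import Sequence, Tuple

IM_WIDTH, IM_HEIGHT = 640, 480  # window size given in the text


def box_to_pixels(
    box: Sequence[float], width: int = IM_WIDTH, height: int = IM_HEIGHT
) -> Tuple[int, int, int, int]:
    """Convert a normalized [ymin, xmin, ymax, xmax] box to pixel (xmin, xmax, ymin, ymax)."""
    ymin, xmin, ymax, xmax = box
    return (
        round(xmin * width),
        round(xmax * width),
        round(ymin * height),
        round(ymax * height),
    )


# Example box covering the right half of the frame, vertically centered.
print(box_to_pixels([0.25, 0.5, 0.75, 1.0]))
```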
The variables shown in green in the figure are measured in pixels:
Wa = IM_WIDTH, the width of the canvas (640 for this project),
Wo = |Xmax - Xmin|, the width of the object,
W1 = |horizontal midpoint of the canvas - horizontal midpoint of the object|.
The variables shown in orange and beginning with the letter X are the centimeter equivalents of the pixel values above.
The real width of every object in centimeters is predefined in Vision. The pixel values can therefore be scaled to real-world lengths, and the real distance to the object is then obtained with trigonometric ratios and the Pythagorean theorem.
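One possible reconstruction of this calculation is sketched below. It is not taken from BLINDSOFT_TRANSFUSION.py, and the function name `estimate_distance_cm` is hypothetical. It assumes that the 78 degree angle of view is the horizontal field of view and that the object faces the camera, so that the centimeter-per-pixel scale at the object's depth is its known real width divided by Wo.

```python
import math


def estimate_distance_cm(
    xmin: float,
    xmax: float,
    real_width_cm: float,
    im_width: int = 640,
    fov_deg: float = 78.0,  # assumed to be the horizontal field of view
) -> float:
    """Estimate the straight-line distance to an object of known real width."""
    wo = abs(xmax - xmin)  # Wo: object width in pixels
    if wo == 0:
        raise ValueError("object has zero pixel width")
    cm_per_px = real_width_cm / wo  # scale at the object's depth
    xa = im_width * cm_per_px  # real width spanned by the whole frame (Wa in cm)
    # Trigonometric ratio: half the frame width over the tangent of half the FOV.
    depth = (xa / 2.0) / math.tan(math.radians(fov_deg) / 2.0)
    w1 = abs(im_width / 2.0 - (xmin + xmax) / 2.0)  # W1: offset from canvas center
    x1 = w1 * cm_per_px  # lateral offset in cm
    # Pythagorean theorem: combine forward depth and lateral offset.
    return math.hypot(depth, x1)


print(f"{estimate_distance_cm(200, 440, real_width_cm=45.0):.1f} cm")
```

The accuracy of such an estimate depends on how well the predefined real width matches the actual object.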
Another variable passed through the “classtransfer()” function is “lang,” which sets the language of Vision. Bilgegöz/Vision supports four languages: Turkish, English, German, and French. New languages can also be added.
The direction of each object is determined by comparing the midpoint of the object with the midpoint of the canvas.
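The midpoint comparison can be sketched as follows for the horizontal axis. The function name, the "ahead" label, and the 40-pixel tolerance around the center are illustrative assumptions. The same comparison on the vertical axis could be combined with it to produce compound directions.

```python
def horizontal_direction(
    xmin: float, xmax: float, im_width: int = 640, tolerance_px: float = 40.0
) -> str:
    """Classify an object as left, right, or ahead from its horizontal midpoint."""
    object_mid = (xmin + xmax) / 2.0
    canvas_mid = im_width / 2.0
    offset = object_mid - canvas_mid  # positive means right of center
    if abs(offset) <= tolerance_px:  # assumed dead zone around the center
        return "ahead"
    return "right" if offset > 0 else "left"


print(horizontal_direction(500, 640))
```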
OCR – Optical Character Recognition
This is the second operating mode of Vision. It relies on the Tesseract OCR and eSpeak software.
Tesseract OCR converts any text visible in the camera frame into machine-readable text. It uses a supervised machine learning method and can recognize every character it has been trained on with labeled examples.
“tesseract /home/pi/ocr/images/1.jpg stdout”
- It is issued on the command line through the os library in Python.
- It is run by “Object_detection_2.py.”
- It converts the characters in the frame it is given at that moment into text.
- This process is repeated for every frame, so the text seen by the camera is refreshed continuously and delivered to the user in a synthetic voice.
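The per-frame call described above could look like the sketch below. It uses Python's subprocess module rather than the os library mentioned in the text, which is a substitution made here so that the recognized text can be captured directly. The helper names are hypothetical, and the `-l` language option is added as an assumption about how the language would be selected.

```python
import subprocess
from typing import List


def tesseract_command(image_path: str, lang: str = "eng") -> List[str]:
    """Build the argument list for: tesseract <image> stdout -l <lang>."""
    return ["tesseract", image_path, "stdout", "-l", lang]


def read_text(image_path: str, lang: str = "eng") -> str:
    """Run Tesseract on one saved frame and return the recognized text."""
    result = subprocess.run(
        tesseract_command(image_path, lang),
        capture_output=True,
        text=True,
        check=True,  # raise if Tesseract exits with an error
    )
    return result.stdout.strip()
```

Calling `read_text("/home/pi/ocr/images/1.jpg")` once per captured frame would reproduce the loop described in the list, provided Tesseract is installed on the device.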
Text to Speech
The text produced in "BLINDSOFT_TRANSFUSION.py" has to be delivered to the user by voice. For this, a speech synthesizer called eSpeak is used. It can operate in 6 different tones and 51 different languages, including Turkish, English, and German.
The commands it needs are also issued through the OS library:
The command "espeak -vtr " + output + " 2>/dev/null" passes the text to eSpeak.
The speech audio is generated and played to the user on the Raspberry Pi.
Text can be read continuously by applying this to every frame.
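A minimal sketch of this step is shown below, again using subprocess instead of a shell string. Passing the text as a single list element avoids quoting problems when the text contains spaces or quotation marks. The helper names and the language-to-voice table are assumptions for illustration; `tr`, `en`, `fr`, and `de` are eSpeak voice names for the four supported languages.

```python
import subprocess
from typing import List

# Assumed mapping from the project's four languages to eSpeak voice names.
ESPEAK_VOICES = {"turkish": "tr", "english": "en", "french": "fr", "german": "de"}


def espeak_command(text: str, language: str = "turkish") -> List[str]:
    """Build the argument list for: espeak -v <voice> <text>."""
    return ["espeak", "-v", ESPEAK_VOICES[language], text]


def speak(text: str, language: str = "turkish") -> None:
    """Speak the text aloud, discarding eSpeak's error output (like 2>/dev/null)."""
    subprocess.run(
        espeak_command(text, language),
        stderr=subprocess.DEVNULL,
        check=False,
    )
```

For example, `speak("car, left", "english")` would announce a detection result, provided eSpeak is installed on the device.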
Survey with Association of Visually Impaired
The device was tried out by members of the Association of Visually Impaired in İzmit during a breakfast event. User experiences and suggestions for improvement were collected through a survey. Based on the criticism and suggestions received, new development plans were made for the Bilgegöz project, and different prototypes were created for different usage scenarios.
Suggestions obtained from the survey:
- More options for the camera location should be offered (glasses, mobile phone, hat, badge, on the arm…)
- Navigation and obstacle warning systems should be added
- Volume should be adjustable, and a higher maximum volume should be available
- The device should be smaller and lighter
- The device should recognize faces
Bilgegöz Second Version
The software of the Bilgegöz project is compatible with different devices, built in various configurations for different usage scenarios.
Mobile App Version (Android)
The Bilgegöz mobile app was developed for Android by porting the software infrastructure to Java and the Android platform. This gives users more ways to use the Bilgegöz technology.
- Services such as object detection and recognition were offered to blind users.
- In object detection, accuracy rates of up to 97% were obtained.
- It was found that these services are best provided by an optical, camera-based system.
- The device was given a wristwatch form factor, which was the most suitable ergonomically.
- A 3D printer was used to produce the body, resulting in a strong, light, and fit-for-purpose design.
- To make the device cheap and easy to attach to the arm, the band was made from a bag strap.
- The port positions were chosen to be the most convenient for the user.
- A 2600 mAh, 3.8 V battery was judged appropriate given the power consumption.
- The Raspberry Pi Zero was chosen for its small size and low cost.
- A TP4056 charging circuit allows the device to be recharged at any time.
- The appropriate voltage for the Raspberry Pi was obtained with a DC-DC step-up converter.
- To get audio output from the device, analog signals were approximated with digital PWM signals, and a low-pass filter was designed to bring the output of the Pi into the required range. This component, called the sound card of the Vision, was produced specifically for this project.
- The smallest webcam on the market was chosen as the camera of the device. This camera is also very low cost.
- A toggle switch is used to switch Vision between its two modes: object detection and OCR.
- The TensorFlow and OpenCV libraries were used for object detection.
- Edje Electronics' “Object_detection_2.py” program was modified to support distance and location measurement and speech output in addition to detection.
- The Tesseract OCR library was used for reading text, and the code it requires was added to “Object_detection_2.py.”
- The object's identity (e.g., car), location (e.g., northwest), distance, and any recognized text are delivered to the user by voice using the text-to-speech software eSpeak.
- With this device, it was shown that object detection and text recognition can be offered together on a single device. Both services are delivered to the blind user, selectable with the mode switch, making their lives easier.
- Results with very low error rates can be obtained even with a very simple webcam and a cheap computer, thanks to the power of camera-based machine learning.
- Devices for the visually impaired are typically very complex and expensive. This device costs only 100 TL, yet it delivers the required performance in the object detection and text recognition tasks.
- This shows that even a very simple device can bring down the cost of health devices, especially assistive devices for the blind, when it has powerful, purpose-built software.
Conclusion and Discussion
In this project, an assistive technology for blind people was developed. The device is built in the form of a wristwatch with a camera. Its software, written in Python, combines machine learning with object detection and text recognition to give the user audible information. TensorFlow and OpenCV are used for object detection, Tesseract OCR for text recognition, and eSpeak for text to speech.
This system, the first of its kind, should make the lives of blind people much easier. Thanks to its very low cost, the technology is also within reach of anybody, including those with low socioeconomic status.
- Switching between two modes with a switch is enough for the first stage of this project. As a next step, however, it is planned to make Vision fully multitasking by using a computer with more processing power and memory. The device could then describe whatever it sees without being given a mode; for example, it could report text while detecting objects, helping the user make better decisions. Full simultaneity also requires software upgrades, for which there are two options. First, the two programs can be combined into one and run one after the other. Second, if the computer has multiple cores, the two programs can run in parallel.
- Bilgegöz can shrink as smaller components become available. How small the wristwatch can be made depends on the size of its parts, and much smaller sizes will be reachable as battery and computer technology improve. If sufficiently small components can be obtained, the system could even fit in a standard wristwatch, making the user feel much less conspicuous.
- The electronic components of Bilgegöz are connected with thin wires and solder. In later stages, a custom circuit can be developed for Vision by designing a PCB. Instead of large off-the-shelf modules that take up a lot of space (including the Raspberry Pi Zero), tiny surface-mount (SMD) components or integrated circuits could then be used to make the device smaller.
- Detection accuracy can be increased by using a higher-quality camera instead of the low-cost webcam.
- A better processor could replace the Raspberry Pi Zero to reduce the size of the device, increase processing power, and let the software run faster. This recommendation goes together with the PCB proposal: to get below the size of the Raspberry Pi, the processor has to be sourced separately and mounted directly on the board, as in modern smartphones.
- With a more efficient battery, the volume can be reduced and the energy capacity, i.e., the operating time, can be increased.
- The sound card can be further improved, and a better quality sound output can be achieved by using noise-canceling integrated circuits.
- If the device can be given an internet connection:
- More objects can be identified with more accuracy thanks to online computer vision services.
- Face recognition (identifying people) can be added.
- Facial expressions and emotions could be detected.
- Weather forecasts can be given to the user.
- Every user can be allowed to add new objects.
- Navigation can be provided.
- Meals can be detected, and the user can be informed of their nutritional values or of how they fit the user's diet.
- The user's health metrics can also be tracked with various health sensors.
- Bilgegöz can turn into a “smart personal assistant” that meets the needs of visually impaired users and makes their lives easier.
Feasibility and Common Effect
- Vision has a very simple mechanical and electrical structure yet powerful software. It can therefore be produced easily with 3D printers.
- The only stage left before the device can go into mass production is the PCB design. After that, Bilgegöz will be completely ready for production.
- An open-source community can also be built by sharing the project's 3D design files, circuit schematics, and software, so that people all around the world can access and contribute to the project.
- Vision has great potential to make a difference in the lives of blind people, as it fulfills the two tasks a visually impaired person needs most.
- Vision is accessible to everybody thanks to its reasonable price.
- A pioneering study was made of the use of machine learning in the lives of blind people. This project may well become a model for similar studies in the near future.
Awards This Project Has Received
40th Beijing Youth Science Creation Competition (BYSCC) Representative 2020
Beijing Association for Science and Technology / Chosen as the Turkish representative for the 40th Beijing Youth Science Creation Competition, to be held in Beijing, PRC, by the Beijing Association for Science and Technology in March 2020
Third Place at National TÜBİTAK 2204-A Engineering Design
TÜBİTAK (Scientific and Technological Research Council of Turkey) / Third-place winner in Engineering Design at the 50th High School Students Research Projects Final Competition
Championship at OKSEF 2019 (Energy&Engineering)
Oğuzhan Özkaya Education / First place in Energy&Engineering category at OKSEF (Oğuzhan Özkaya Education – Karademir Science Energy Engineering Fair) organized in Oğuzhan Özkaya Schools