Characters, Unicode and Encoding

Handling Non-ASCII User Input

How do I handle user input that might contain non-ASCII characters?


Handling non-ASCII user input in C++ comes down to two decisions: which encoding the text uses, and which character type stores it. Here are three common approaches:

Use Wide Character Types

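When dealing with non-ASCII input, one option is a wide character type such as wchar_t, which can represent a wider range of characters than the basic char type. The standard library provides wide streams (std::wcin and std::wcout) and std::wstring for wchar_t only; char16_t and char32_t have string types but no standard streams. The size of wchar_t is platform-dependent: it is typically 16 bits on Windows and 32 bits on Linux and macOS.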

#include <iostream>
#include <string>

int main() {
  std::wstring input;
  std::wcout << L"Enter some text "
    L"(including non-ASCII characters): ";
  std::getline(std::wcin, input);
  std::wcout
    << L"You entered: " << input << L'\n';
}
Enter some text (including non-ASCII characters): 123 ❤️
You entered: 123 ❤️

Set the Locale

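In practice, the wide streams usually need a locale before they can convert non-ASCII characters. A C++ program starts in the minimal "C" locale, so on some platforms the previous example may stop reading or print nothing once it meets a non-ASCII character. Passing an empty string to std::locale selects the user's preferred locale from the environment: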

#include <iostream>
#include <string>
#include <locale>

int main() {
  std::locale::global(std::locale(""));
  std::wcout.imbue(std::locale());
  std::wcin.imbue(std::locale());

  std::wstring input;
  std::wcout << L"Enter some text (including "
                L"non-ASCII characters): ";
  std::getline(std::wcin, input);
  std::wcout << L"You entered: " << input << L'\n';
}
Enter some text (including non-ASCII characters): 123 ❤️
You entered: 123 ❤️

Use UTF-8 Encoding

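If you prefer to work with UTF-8 (the default on most Linux and macOS terminals, and increasingly common elsewhere), you can use the regular std::string type. The string simply stores the bytes it receives, so your environment must supply and display UTF-8. On Windows, that means switching the console code pages: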

#ifdef _WIN32
  #include <windows.h>
#endif

#include <iostream>
#include <string>

int main() {
#ifdef _WIN32
  SetConsoleOutputCP(CP_UTF8);
  SetConsoleCP(CP_UTF8);
#endif

  std::string input;
  std::cout << "Enter some text (including "
               "non-ASCII characters): ";
  std::getline(std::cin, input);
  std::cout << "You entered: " << input << '\n';
}
Enter some text (including non-ASCII characters): 123 ❤️
You entered: 123 ❤️

Remember, when working with non-ASCII input, it's crucial to be consistent with your encoding throughout your program. Mixing different encodings can lead to unexpected results and display issues.

Also, be aware that the behavior of these examples may vary depending on your operating system, compiler, and terminal settings. Always test your program with a variety of inputs to ensure it handles non-ASCII characters correctly.

This Question is from the Lesson:

Characters, Unicode and Encoding

An introduction to C++ character types, the Unicode standard, character encoding, and C-style strings

Answers to questions are automatically generated and may not have been reviewed.
