std::string to lowercase or uppercase in C++

Published: 07-11-2019 | Author: Remy van Elst



I'm using codewars to practice my development skills. Today I learned a method to transform a std::string's casing, either to uppercase or lowercase. It uses std::transform with a lambda to loop over all characters in the string. Researching it further, I also found out how to handle Unicode strings with Boost. This article also includes a mini how-to on installing Boost on Windows 10 via MinGW for use with CLion.

Case transformation for ASCII

The codewars assignment was to count unique lowercase characters in a string, then return the character which was found the most. For the string "hello" this would be l since it's found twice. To do this I first needed to convert the string to lowercase. This is the code I used to lowercase the string for the codewars practice:

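#include <algorithm>
#include <cctype>
#include <iostream>
#include <string>

int main() {
    std::string inStr = "UPPERCASE";
    // Lowercase every character in place; unsigned char avoids undefined
    // behaviour in std::tolower for negative char values.
    std::transform(inStr.begin(), inStr.end(), inStr.begin(),
                   [](unsigned char c){ return std::tolower(c); });
    std::cout << inStr << std::endl;
    return 0;
}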

Example output:

#C:\Users\Remy\CLionProjects\test1\cmake-build-debug\src\example.exe
uppercase

For uppercase:

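#include <algorithm>
#include <cctype>
#include <iostream>
#include <string>

int main() {
    std::string inStr = "lowercase";
    // Uppercase every character in place.
    std::transform(inStr.begin(), inStr.end(), inStr.begin(),
                   [](unsigned char c){ return std::toupper(c); });
    std::cout << inStr << std::endl;
    return 0;
}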

Example output:

#C:\Users\Remy\CLionProjects\test1\cmake-build-debug\src\example.exe
LOWERCASE
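
The codewars task described above (lowercase the input, then find the character that occurs most often) could be sketched as follows. This is not the solution I submitted. The function name most_frequent_lowercase, the restriction to the letters a-z and the alphabetical tie-break are assumptions made for this illustration.

```cpp
#include <algorithm>
#include <array>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not the original kata solution): returns the most
// frequent character of `input` after lowercasing it, counting only the
// ASCII letters a-z. Ties go to the letter that comes first alphabetically.
// Returns '\0' if the input contains no ASCII letters.
char most_frequent_lowercase(std::string input) {
    // Same transform as in the article: lowercase each byte in place.
    std::transform(input.begin(), input.end(), input.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    std::array<std::size_t, 26> counts{};  // one zero-initialized counter per letter
    for (unsigned char c : input) {
        if (c >= 'a' && c <= 'z') {
            ++counts[c - 'a'];
        }
    }

    // max_element returns the first maximum, which gives the alphabetical tie-break.
    const auto best = std::max_element(counts.begin(), counts.end());
    if (*best == 0) {
        return '\0';
    }
    return static_cast<char>('a' + (best - counts.begin()));
}
```

Calling most_frequent_lowercase("hello") returns 'l', matching the example from the assignment. A fixed array of 26 counters is enough here because only ASCII letters are counted; a std::map would be needed for anything wider.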

Non-ASCII

Remember: every time you assume text is ASCII, a kitten dies somewhere.

The code above does not work with emoji:

std::string inStr = "\U0001F4A9 ";
std::transform(inStr.begin(), inStr.end(), inStr.begin(),
               [](unsigned char c){ return std::tolower(c); });
std::cout << inStr << std::endl;

This won't give the expected result. I'm using an image since your browser will probably not render this correctly:

[image: screenshot of the output]

A Unicode string, such as a common German word, will not work either; it gives the same kind of weird output.
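
One likely reason is that a std::string holds bytes, not characters, so the per-byte transform never sees a multi-byte UTF-8 character as a whole. The sketch below illustrates this under two assumptions: the text is UTF-8 encoded and the program runs in the default "C" locale. The names to_lower_bytewise and upper_utf8 are made up for this example.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: the byte-wise lowercase transform from the ASCII section.
std::string to_lower_bytewise(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// "GRÜSSEN" spelled out as UTF-8 bytes so the example does not depend on
// the encoding of the source file: U+00DC (Ü) is the two bytes 0xC3 0x9C.
const std::string upper_utf8 = "GR\xC3\x9CSSEN";

// In the default "C" locale std::tolower only maps A-Z, so the two bytes of
// the Ü pass through untouched: the result is "gr" + 0xC3 0x9C + "ssen",
// which still contains an uppercase Ü. A correct lowercase ü would be the
// bytes 0xC3 0xBC.
```

Note that upper_utf8.size() is 8 even though the word has 7 characters, which is another sign that looping over a std::string loops over bytes.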

But, with Boost and ICU you can get this to work. The setup is difficult, but when you have it compiling and working, it's a pleasure to work with. You can just pass entire strings instead of looping over every character.

Boost

Boost is a set of libraries for C++ development, some of which end up in the standard library after a few years.

To include Boost in your CMake project, either install it with your package manager or download it manually.

Installing Boost on Windows 10 or Ubuntu

On Ubuntu 18.04 it's as simple as:

apt-get install libboost-all-dev

TL;DR: On Windows 10 use this mingw build or be warned. Here be dragons.

It cost me multiple hours of troubleshooting and debugging. Apparently MinGW and Boost on Windows are not the best of friends, especially if you also need Boost.Locale, because then libICU is required as well. If you use Visual Studio with MSVC, or CLion with MSVC instead of MinGW, it should all be less problematic. libICU provides downloads for MSVC; for MinGW you're on your own, good luck with compiling.

Open a cmd prompt, navigate to the Boost folder and build Boost. If you have Visual Studio installed you can use that. I use MinGW, so I have to specify that toolset and run a MinGW cmd prompt (via the .bat file provided by MinGW). Make sure g++ is available as a command:

C:\Users\Remy\Downloads\boost_1_71_0\boost_1_71_0>g++ --version
g++ (i686-posix-sjlj, built by strawberryperl.com project) 4.9.2
Copyright (C) 2014 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Bootstrap:

C:\Users\Remy\Downloads\boost_1_71_0\boost_1_71_0>bootstrap.bat gcc
Building Boost.Build engine

Generating Boost.Build configuration in project-config.jam for msvc...

Bootstrapping is done. To build, run:

    .\b2
[...]

Build:

b2 toolset=gcc 

[lots and lots of compiling later]
    1 file(s) copied.
...failed updating 666 targets...
...skipped 204 targets...
...updated 1573 targets...

Install:

b2 toolset=gcc install

This will install into C:\Boost, where CMake's FindBoost module will detect it. If you specify a different folder, you need to set BOOST_ROOT as an environment variable or pass it to CMake.

In your CMakeLists.txt file the following options might help with debugging if you get errors:

set (Boost_DEBUG ON)
set (Boost_ARCHITECTURE "-x32")
set (Boost_USE_STATIC_LIBS ON)
set (Boost_USE_MULTITHREADED ON)
set (Boost_DETAILED_FAILURE_MSG ON)

Do note that I spent a few hours fiddling and trying to get the Boost.Locale library to compile. I ended up with a linker error:

C:/PROGRA~2/MINGW-~1/I686-8~1.0-P/mingw32/bin/../lib/gcc/i686-w64-mingw32/8.1.0/
../../../../i686-w64-mingw32/lib/../lib/libiconv.a(localcharset.o):localcharset.c
:(.text+0x73): undefined reference to `_imp__GetACP@0'

This was due to not having libICU installed. As stated earlier, I gave up because, as far as I could find, ICU only provides MSVC-compatible builds, not MinGW builds.

Continue on with this guide on a Linux system if you want to follow along, or use CLion with MSVC instead of MinGW.

Update after another few hours of debugging: when using this build of MinGW by Stephan T. Lavavej, the code and the CMake setup do compile and link without errors.

Boost in your CMakeLists file

If you've followed my setup guide for CMake then you should add this to the main root-folder CMakeLists.txt file right before include_directories:

find_package(Boost REQUIRED COMPONENTS locale)
if(Boost_FOUND)
    include_directories(${Boost_INCLUDE_DIR})
    message("-- Boost found: ${Boost_VERSION_STRING}")
else()
    message(FATAL_ERROR "Boost not found!")
endif()

In the src/CMakeLists.txt file, add the following at the bottom:

if(Boost_FOUND)
    target_link_libraries (${BINARY} ${Boost_LIBRARIES})
    message("-- Boost link to: ${Boost_VERSION_STRING}")
else()
    message(FATAL_ERROR "Boost not found!")
endif()

If all went well, your CMake output should include the two new messages:

-- Boost found: 1.71.0
-- Boost link to: 1.71.0

-- Configuring done
-- Generating done
-- Build files have been written to: C:/Users/Remy/CLionProjects/test1/cmake-build-debug

Boost locale conversion code

This is the code I used with Boost to convert uppercase to lowercase:

#include <boost/locale.hpp>
#include <iostream>
#include <string>

int main() {
    // Generate the system default locale and use it globally and for std::cout.
    boost::locale::generator gen;
    std::locale loc = gen("");
    std::locale::global(loc);
    std::cout.imbue(loc);
    std::string grussen = "grussEN";
    std::string poopla = "\U0001F4A9";
    std::cout   <<"Upper "<< boost::locale::to_upper(grussen)  << std::endl
                <<"Lower "<< boost::locale::to_lower(grussen)  << std::endl
                <<"Title "<< boost::locale::to_title(grussen)  << std::endl
                <<"Fold  "<< boost::locale::fold_case(grussen) << std::endl
                <<"Poop  "<< boost::locale::to_lower(poopla)   << std::endl;
    return 0;
}

It's mostly example code from Boost.

My static site generator doesn't like the German sharp S and the U with umlaut; it will not render them correctly, so they are written as plain "ss" and "u" above. Here's a picture of the code:

[image: screenshot of the code]

The result works as you would expect:

[image: screenshot of the output]
