Converting std::wstring to UTF-8 in C++11 and writing UTF-8 files with fstream

Simple conversion can be done like this:

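#include <string>
#include <locale>   // std::wstring_convert
#include <codecvt>  // std::codecvt_utf8 (deprecated since C++17)

// Encode a wide string as a UTF-8 byte string.
std::string ws2utf8(const std::wstring &input)
{
 std::wstring_convert<std::codecvt_utf8<wchar_t>> utf8conv;
 return utf8conv.to_bytes(input);
}

// Decode a UTF-8 byte string into a wide string.
std::wstring utf82ws(const std::string &input)
{
 std::wstring_convert<std::codecvt_utf8<wchar_t>> utf8conv;
 return utf8conv.from_bytes(input);
}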

To write std::wstrings as UTF-8 text files:

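#include <string>
#include <iostream>
#include <fstream>
#include <locale>   // std::wbuffer_convert
#include <codecvt>

void writeUtf8(const std::wstring &output, const std::string &filename)
{
 std::ofstream utf8file(filename);
 // Wrap the file's byte buffer so wide characters are encoded as UTF-8 on the way out.
 std::wbuffer_convert<std::codecvt_utf8<wchar_t>> converter(utf8file.rdbuf());
 std::wostream out(&converter);

 out << output;
 // Flush the converting buffer so every byte reaches the file before it is closed.
 out.flush();

 utf8file.close();
}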

How much RAM should the JVM of my webserver use?

When trying to estimate an appropriate heap size for a Java process, there are many variables to consider.

In any modern software system, allowing a single process to dominate the available RAM (or any other limited resource) is likely to incur a significant overhead on the host operating system as it will almost certainly cause resource starvation for every other process.

[http://en.wikipedia.org/wiki/Resource_starvation]

In Java, the heap size is generally defined when the process starts (-Xms, -Xmx, etc.). The Java heap is the memory used internally by Java objects, but it is not the only memory that the process will consume: the JVM must also have non-heap native memory assigned to run the virtual machine itself and to manage class loading and other side effects of a managed runtime.

[https://docs.oracle.com/cd/E13150_01/jrockit_jvm/jrockit/geninfo/diagnos/garbage_collect.html]
[http://javabook.compuware.com/content/memory/how-garbage-collection-works.aspx]
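
To see the heap/non-heap split on a running JVM, here is a minimal sketch using the standard java.lang.management API. The class and method names (MemoryReport, describe, report) are my own, purely for illustration. Keep in mind that the "non-heap" figure only covers the pools the JVM itself tracks (class metadata, compiled code and the like), so the real native footprint of the process will be larger than what this prints.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper class, not part of any product discussed here.
final class MemoryReport {

    // Format one MemoryUsage snapshot in whole megabytes.
    static String describe(String label, MemoryUsage usage) {
        long mb = 1024L * 1024L;
        return label + ": used=" + (usage.getUsed() / mb) + " MB, committed="
                + (usage.getCommitted() / mb) + " MB";
    }

    // Report the Java heap and the JVM-tracked non-heap pools side by side.
    static String report() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return describe("heap", memory.getHeapMemoryUsage())
                + System.lineSeparator()
                + describe("non-heap", memory.getNonHeapMemoryUsage());
    }

    public static void main(String[] args) {
        System.out.println(report());
    }
}
```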

Furthermore, since Java applications must also interact with the host operating system for activities such as networking and disk I/O, direct memory buffers will be continuously allocated and de-allocated throughout the lifespan of the process.

[http://www.ibm.com/developerworks/library/j-nativememory-linux/]
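
Direct buffers live outside the Java heap, which is easy to demonstrate. The sketch below allocates a 1 MB direct buffer and reads the JVM's "direct" buffer pool before and after through BufferPoolMXBean (available since Java 7). The class name and the directBytesUsed helper are hypothetical, and I am assuming the pool is reported under the name "direct", which is what I see on HotSpot/OpenJDK based JVMs.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

// Hypothetical demo class, for illustration only.
final class DirectBufferDemo {

    // Bytes currently held by the "direct" buffer pool, or -1 if the JVM does not report one.
    static long directBytesUsed() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        long before = directBytesUsed();
        // This memory is native: it does not count against -Xmx.
        ByteBuffer buffer = ByteBuffer.allocateDirect(1024 * 1024);
        long after = directBytesUsed();

        System.out.println("buffer capacity: " + buffer.capacity() + " bytes");
        System.out.println("direct pool grew by: " + (after - before) + " bytes");
    }
}
```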

Externally to the actual java process, in addition to the fixed overhead of actually running an operating system, modern operating systems will consume memory that is largely driven by workload. One such source of memory consumption that is of significant importance to Jazz applications is file caching that will provide accelerated I/O performance for files. (In Jazz applications on-disk files are frequently read and written to provide rapid search capabilities through indexes that are not stored in the database but are too large to hold directly in RAM.)

[https://msdn.microsoft.com/en-us/library/windows/desktop/aa364218(v=vs.85).aspx]
[https://www.thomas-krenn.com/en/wiki/Linux_Page_Cache_Basics]

Given these factors, you must look at the normal memory conditions of the server and then determine what is appropriate.

From personal experience and what I’ve learned in the past two years diagnosing and monitoring CLM deployments, if I were planning new CLM deployments I would be even more aggressive than IBM advises for Java web services in general and suggest the following:
25% of RAM for the Java heap
25% for native memory
25% for disk caching
25% for the OS
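
As a worked example of that split, here is a small sketch that turns a server's total RAM into a suggested -Xmx value. HeapBudget and its methods are made-up names, and the 16384 MB default is just an example figure, not a recommendation for any particular server.

```java
// Hypothetical helper: applies the 25% rule of thumb described above.
final class HeapBudget {

    // One quarter of total RAM, in megabytes.
    static long quarterMb(long totalRamMb) {
        return totalRamMb / 4;
    }

    // The -Xmx argument that gives the Java heap one quarter of total RAM.
    static String suggestedXmx(long totalRamMb) {
        return "-Xmx" + quarterMb(totalRamMb) + "m";
    }

    public static void main(String[] args) {
        // Total RAM in MB comes from the first argument; 16384 (16 GB) is only an example default.
        long totalRamMb = args.length > 0 ? Long.parseLong(args[0]) : 16384;
        long share = quarterMb(totalRamMb);

        System.out.println("Java heap:     " + share + " MB (" + suggestedXmx(totalRamMb) + ")");
        System.out.println("native memory: " + share + " MB");
        System.out.println("disk caching:  " + share + " MB");
        System.out.println("OS:            " + share + " MB");
    }
}
```

For a 16 GB server this suggests -Xmx4096m, leaving the other three quarters for native memory, the file cache and the operating system.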

TL;DR Trying to use more than half of your RAM for java heaps is daft unless you want bad performance, because that’s how you get bad performance.

Configuring the IBM JVM for RTC, RSA and other Eclipse based IDE products

Overview

I have been surprised by how little RTC and RSA users know about customizing their environment. I have found with several development teams that increasing everyone’s familiarity with the eclipse.ini and possible configurations can have a surprising impact on productivity and reduction of downtime. In my experience, the number one cause of “Why is Eclipse so slow?” and “Why does Eclipse crash so much?” is a failure to scale the JVM settings to meet the actual demands.

A useful resource will be the IBM JVM defaults:

http://publib.boulder.ibm.com/infocenter/javasdk/v6r0/index.jsp?topic=%2Fcom.ibm.java.doc.diagnostics.60%2Fdiag%2Fappendixes%2Fdefaults.html

c – The setting is controlled by a command-line parameter only.
e – The setting is controlled by an environment variable only.
ec – The setting is controlled by a command-line parameter or an environment variable. The command-line parameter always takes precedence.

JVM setting | AIX® | IBM® i | Linux | Windows | z/OS® | Setting affected by
Default locale | None | None | None | N/A | None | e
Time to wait before starting plug-in | N/A | N/A | Zero | N/A | N/A | e
Temporary directory | /tmp | /tmp | /tmp | c:\temp | /tmp | e
Plug-in redirection | None | None | None | N/A | None | e
IM switching | Disabled | Disabled | Disabled | N/A | Disabled | e
IM modifiers | Disabled | Disabled | Disabled | N/A | Disabled | e
Thread model | N/A | N/A | N/A | N/A | Native | e
Initial stack size for Java™ Threads, 32-bit. Use -Xiss<size> | 2 KB | 2 KB | 2 KB | 2 KB | 2 KB | c
Maximum stack size for Java Threads, 32-bit. Use -Xss<size> | 256 KB | 256 KB | 256 KB | 256 KB | 256 KB | c
Stack size for OS Threads, 32-bit. Use -Xmso<size> | 256 KB | 256 KB | 256 KB | 32 KB | 256 KB | c
Initial stack size for Java Threads, 64-bit. Use -Xiss<size> | 2 KB | N/A | 2 KB | 2 KB | 2 KB | c
Maximum stack size for Java Threads, 64-bit. Use -Xss<size> | 512 KB | N/A | 512 KB | 512 KB | 512 KB | c
Stack size for OS Threads, 64-bit. Use -Xmso<size> | 256 KB | N/A | 256 KB | 256 KB | 256 KB | c
Initial heap size. Use -Xms<size> | 4 MB | 4 MB | 4 MB | 4 MB | 4 MB | c

Maximum Java heap size. Use -Xmx<size> (setting affected by: c)
AIX®: Half the available memory with a minimum of 16 MB and a maximum of 512 MB
IBM® i: 2 GB
Linux: Half the available memory with a minimum of 16 MB and a maximum of 512 MB
Windows: Half the real memory with a minimum of 16 MB and a maximum of 2 GB
z/OS®: Half the available memory with a minimum of 16 MB and a maximum of 512 MB

 

Note that on Linux a 32-bit JVM will by default have a maximum heap size of 512 MB.

Before changing memory settings for RSA or RTC it is important to consider available RAM. IBM’s rule of thumb is that the total configured heap size of all Java processes should not consume more than half of available RAM, in order to leave sufficient RAM for the operating system itself as well as any native memory allocation that may occur. For example, when using networking protocols the operating system and Java are probably allocating byte buffers (per socket connection in the case of HTTP and other TCP connections). This is especially important to consider when configuring a server, as you will see significantly higher memory consumption at the operating system level than what has been configured in Java using -Xmx etc.

Configuring Eclipse

Since both RTC and RSA are built on the Eclipse platform, the JVM is configured via eclipse.ini.

My 4.0.3 test environment has this as an ini, which I think is the default:

-vm
C:\Users\work\usr\RTC_403_20130328\RTC\TeamConcert\jdk\jre\bin\javaw.exe
--launcher.XXMaxPermSize
256m
-startup
plugins/org.eclipse.equinox.launcher_1.1.1.R36x_v20101122_1400.jar
-install
C:\Users\work\usr\RTC_403_20130328\RTC\TeamConcert
--launcher.library
plugins/org.eclipse.equinox.launcher.win32.win32.x86_64_1.1.2.R36x_v20101222
-vmargs
-Xms100m
-Xmx512m
-Dosgi.requiredJavaVersion=1.5
-Dosgi.bundlefile.limit=100

What we’re particularly interested in are the vmargs that configure memory: -Xmx and -Xms.

-Xms
Sets the initial Java heap size.

-Xmx
Sets the maximum Java heap size for the application (-Xmx must be >= -Xms).
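
If you want to confirm what a given pair of flags actually produces, a tiny program run with the same -Xms/-Xmx arguments will tell you. This is only a sketch: HeapLimits is a name I made up, and the values the JVM reports are usually a little different from the exact figures you pass in. Note that a program launched from inside the IDE runs in its own JVM, so this shows the effect of the flags you give it, not the settings of the running Eclipse instance.

```java
// Hypothetical check: run with e.g. "java -Xms100m -Xmx512m HeapLimits.java".
final class HeapLimits {

    // Upper bound the heap may grow to (driven by -Xmx), in megabytes.
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    // Heap currently reserved by the JVM (starts near -Xms), in megabytes.
    static long committedHeapMb() {
        return Runtime.getRuntime().totalMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("maximum heap:   " + maxHeapMb() + " MB");
        System.out.println("committed heap: " + committedHeapMb() + " MB");
    }
}
```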

As you can see, the defaults are quite small. Even if you only have a 32-bit machine with 3 GB or 4 GB of RAM, you can easily run up to 1.5 GB. However, in a 32-bit Windows environment you will probably be unable to set -Xmx much larger than 1580m. (And you may find that you are pushing the host to its limits if you actually use that much memory.)

I almost always set -Xmx to 1g to allow a maximum heap size of one gigabyte. However, I have had a handful of plugin development environments that have required me to go up to 1580m to avoid running out of memory when resetting the plugin target to a new version of Eclipse.

Eclipse Heap Status

While running any Eclipse based IDE product you should have the Heap Status Monitor available. It can be enabled in General preferences by checking the “Show heap status” checkbox. With that setting enabled you can see the Java memory status. Hovering the mouse over the widget gives you a tooltip with more info, and clicking the garbage can icon forces a System.gc(). More options are available from the context menu. I’m a big fan of the ‘Show Max Heap’ option.

Quick review

  • the 32-bit default heap size is quite small
  • in Eclipse this is controlled by command line vmargs in eclipse.ini
  • -Xmx1g and -Xmx1580m are good values even on a 32-bit OS
  • you can monitor the Java heap in Eclipse by using the ‘Show heap status’ preference

Note that 64-bit versions of both RTC and RSA are available, but not for all versions. (I think 64-bit RTC is 4+ and RSA is 8+.)

Other resources

IBM Java 1.6 documentation of memory settings:
http://publib.boulder.ibm.com/infocenter/javasdk/v6r0/index.jsp?topic=%2Fcom.ibm.java.doc.diagnostics.60%2Fdiag%2Fappendixes%2Fcmdline%2Fcommands_gc.html

http://wiki.eclipse.org/FAQ_How_do_I_run_Eclipse%3F
http://wiki.eclipse.org/Eclipse.ini

The Eclipse 3.6 (Helios) command line documentation is here; any of the options listed there can be added to eclipse.ini:
http://help.eclipse.org/helios/index.jsp?topic=%2Forg.eclipse.platform.doc.isv%2Freference%2Fmisc%2Fruntime-options.html

I also found a Stack Overflow question that has the most complete list of configuration issues I’ve ever seen, though it was started some time ago and doesn’t address large heaps or 64-bit environments.
http://wiki.eclipse.org/FAQ_How_do_I_run_Eclipse%3F

Creating HTML in Java with the DOM

Most of the examples I’ve seen of generating HTML from Java date back to the stone ages. Here’s an example of doing it the right way.

package demo;

import java.io.FileOutputStream;
import java.io.IOException;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Text;

public class HTMLDemo {

   public static void main(String[] args) {
      FileOutputStream outputStream = null;

      try {
         Document newDocument = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
         Element html = newDocument.createElement("html");
         Element head = newDocument.createElement("head");
         Element body = newDocument.createElement("body");

         Element p = newDocument.createElement("p");
         Text textNode = newDocument.createTextNode("Hello World");

         p.appendChild(textNode);
         body.appendChild(p);

         html.appendChild(head);
         html.appendChild(body);

         newDocument.appendChild(html);

         outputStream = new FileOutputStream("/demo.html");
         TransformerFactory.newInstance().newTransformer().transform(
               new DOMSource(newDocument), new StreamResult(outputStream));         

      } catch (Exception e) {
         e.printStackTrace();
      } finally {
         if (outputStream != null) {
            try {
               outputStream.close();
            } catch (IOException e) {
               e.printStackTrace();
            }
         }
      }
   }
}
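
A variation I find handy when the HTML is going into a response or a log rather than a file: serialize the DOM to a String instead. This is only a sketch, with names of my own choosing (HtmlStringDemo, render). It sets the output method to "html" explicitly rather than relying on the transformer's defaults, and it shows one reason to build markup with the DOM at all: text nodes are escaped for you.

```java
import java.io.StringWriter;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical variant of HTMLDemo that renders to a String instead of a file.
final class HtmlStringDemo {

    static String render(String message) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();

        // Build <html><body><p>message</p></body></html>.
        Element html = doc.createElement("html");
        Element body = doc.createElement("body");
        Element p = doc.createElement("p");
        p.appendChild(doc.createTextNode(message));
        body.appendChild(p);
        html.appendChild(body);
        doc.appendChild(html);

        // Ask for HTML serialization explicitly and keep the output on one line.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.METHOD, "html");
        transformer.setOutputProperty(OutputKeys.INDENT, "no");

        StringWriter writer = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(writer));
        return writer.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(render("Hello World"));
        // The "<" in the text is escaped by the serializer, not by hand.
        System.out.println(render("1 < 2"));
    }
}
```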