Monday, January 28, 2013

Character Arrays in C++

This tutorial will show you how to use char (character) arrays in C++. In case you don't know, in the coding world there are strings of text, e.g.:

"Hello world!"

and these strings can be divided into arrays of characters:

'H' + 'e' + 'l' + 'l' + 'o' + ' ' + 'w' + 'o' + 'r' + 'l' + 'd' + '!'

Each character within a string, file, network packet, etc. can easily be accessed individually if you use a character array:

charArray[0] == 'H'
charArray[1] == 'e'
etc...
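
To make the indexing idea concrete, here is a minimal sketch that visits each index of a character array in turn. The helper name "countChar" is my own invention for illustration and is not part of the finished program.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: counts how many times `target` appears in a
// null-terminated character array by checking one index at a time.
std::size_t countChar(const char text[], char target)
{
    std::size_t count = 0;
    const std::size_t length = std::strlen(text);

    for (std::size_t i = 0; i < length; ++i)
    {
        // text[i] is the character stored at position i (counting from 0).
        if (text[i] == target)
        {
            ++count;
        }
    }

    return count;
}
```

For example, after declaring char charArray[] = "Hello world!"; a call to countChar(charArray, 'l') should return 3, since the letter appears at indexes 2, 3, and 9.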

In this tutorial, we will be making a program that will read an input file (which is expected to be a series of phone numbers separated by commas), go through each character in the file and make some changes, and then output the result to another file.

Before I get anywhere, let me explain the details a bit. We're expecting a format similar to the output of my previous tutorial (CSV format):

245-984-6445,4968396784,3463456734,470-484-7209
2459846445,496-839-6784,13463456734,4704847209
245-984-6445,14968396784,3463456734,4704847209
2459846445,4968396784,13463456734,14704847209

We're also expecting each column of each row to be 10-11 digits, where the 11-digit numbers only have 11 digits because they start with a "1" (e.g. 14968396784). Some of the numbers may also have a dash ("-") character separating the 3 different parts of the phone number.

The goal of our program is to:
  1. Open the file "input.txt" (that's in the same folder as the program).
  2. Remove all dash characters (245-984-6445 -> 2459846445).
  3. Remove all of the 1's at the beginning of numbers that have them (14968396784 -> 4968396784).
  4. Place a comma after the first 3 digits in each number (4968396784 -> 496,8396784).
  5. Save the result to "output.txt".
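
Goals 2 through 4 can be sketched for a single phone number before we worry about files. This is only an illustration under my own assumptions: the helper name "formatOne" is hypothetical, the input is assumed to contain nothing but digits and dashes, and unlike the finished program it only drops a leading 1 when the number has 11 digits.

```cpp
#include <string>

// Hypothetical helper (not part of the tutorial's program): applies
// goals 2-4 to one phone number made up of digits and dashes only.
std::string formatOne(const std::string& number)
{
    std::string digits;

    // Goal 2: copy everything except the dashes.
    for (char c : number)
    {
        if (c != '-')
        {
            digits += c;
        }
    }

    // Goal 3: drop the leading 1 from an 11-digit number.
    if (digits.length() == 11 && digits[0] == '1')
    {
        digits.erase(0, 1);
    }

    // Goal 4: place a comma after the first 3 digits.
    if (digits.length() > 3)
    {
        digits.insert(3, ",");
    }

    return digits;
}
```

With this sketch, formatOne("245-984-6445") gives "245,9846445" and formatOne("14968396784") gives "496,8396784". The real program below does the same work in a single pass over the whole file instead of one number at a time.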

Here are some useful links for you in case you want them:

Code file ("main.cpp"): http://thecodingwebsite.com/tutorials/phonenumberformatter/main.cpp
Code file as a ".txt" file ("main.txt") so you can view it in your browser: http://thecodingwebsite.com/tutorials/phonenumberformatter/main.txt
Finished program: http://thecodingwebsite.com/tutorials/phonenumberformatter/PhoneNumberFormatter.exe
Sample input file ("input.txt"): http://thecodingwebsite.com/tutorials/phonenumberformatter/input.txt
Sample output file ("output.txt") (what the output should look like after running the program): http://thecodingwebsite.com/tutorials/phonenumberformatter/output.txt


Note: from this point forward, I'm going to assume you have some of the basic knowledge covered in my previous tutorial to avoid redundancy.


I'm going to start with the basic code structure from the previous tutorial, adding "#include <string>" so that we can use strings in our program:
#include <cstdlib> //For system().
#include <iostream>
#include <fstream>
#include <string>

using namespace std;

int main()
{
 cout << "Formatting...\n\n";

 fstream fin("input.txt");
 fstream fout("output.txt", ios::out);

 if (fin && fout)
 {
  
 }
 else
 {
  cout << "Error either opening \"input.txt\" for reading or opening \"output.txt\" for writing.";
 }

 fout.close();
 fin.close();

 cout << "\n\n";

 system("pause");
}
Next, we need to read the entire file in by reading each line as a whole and adding it to the end of the "fileContents" string:
if (fin && fout)
 {
  char nextLine[50000];
  string fileContents = "";

  fin.getline(nextLine, 50000);

  //Read the entire file into the fileContents string.
  while (fin)
  {
   fileContents += nextLine;

   fin.getline(nextLine, 50000);

   if (fin)
   {
    fileContents += "\n";
   }
  }
  
  //Ask the string for its own length (no strlen or <cstring> needed).
  int fileLength = static_cast<int>(fileContents.length());
Directly below that code (still in the same "if" statement), we'll loop through every character from the input file:
string fileOutputContents = "";

  int digitNum = 0;

  for (int i = 0; i < fileLength; ++i)
  {
   //Completely skip the dashes.
   if (fileContents[i] != '-')
   {
    //If it's a new column, reset the digit count and output the new line or comma character like normal.
    if (fileContents[i] == '\n' || fileContents[i] == ',')
    {
     digitNum = 0;
     fileOutputContents += fileContents[i];
    }
    else
    {
     //We're assuming it's a digit now.

     //Check to make sure that this either isn't the first digit in the number or that it's not a 1.
     if (digitNum > 0 || fileContents[i] != '1')
     {
      //If so, output the character.
      fileOutputContents += fileContents[i];

      //Check to see if it's the third digit in the number; if so, output a comma.
      if (digitNum == 2)
      {
       fileOutputContents += ',';
      }

      //Increase the digit counter.
      ++digitNum;
     }
    }
   }
  }
If it's a dash, we'll skip it immediately and continue on to the next character.

Each digit we output will increase an integer called "digitNum" (how many digits into the current number we are). Whenever we reach a new line character ('\n') or a comma, we output that character and reset digitNum back to 0, because a new number is starting. This way we can keep track of the current digit for the next part.

Next, once we can assume the character is a digit (since it's not a dash, new line character, or comma), we'll need to check to see if it's the first digit and if it's a 1. If it's a 1 at the beginning of the number, we'll skip past it just like we did with the dash earlier. A skipped 1 does not increase digitNum, so the digit after it still counts as the first digit. If it's not a 1 or if it's not the first digit in the number, we can output that character to the file.

Also, we check to see if it's the third digit in the number (by comparing digitNum to 2, since we started counting digits in each number at 0). If it is, we'll output a comma immediately after the digit. Then we increase the digit counter as mentioned earlier.
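
The loop above assumes that anything which isn't a dash, comma, or new line must be a digit. As a hedged variation, here is a sketch that checks with std::isdigit instead and quietly skips everything else. The function name "formatAll" and the decision to skip unexpected characters (such as a '\r' from a Windows line ending) are my own assumptions, not what the finished program does.

```cpp
#include <cctype>
#include <string>

// Hypothetical variant of the tutorial's loop: works on an in-memory
// string and ignores anything that isn't a digit, comma, or new line.
std::string formatAll(const std::string& input)
{
    std::string output;
    int digitNum = 0;

    for (char c : input)
    {
        if (c == '\n' || c == ',')
        {
            // New column or row: reset the counter and keep the separator.
            digitNum = 0;
            output += c;
        }
        else if (std::isdigit(static_cast<unsigned char>(c)))
        {
            // Skip a 1 only when it would be the first digit of the number.
            if (digitNum > 0 || c != '1')
            {
                output += c;

                // After the third digit (index 2), add a comma.
                if (digitNum == 2)
                {
                    output += ',';
                }

                ++digitNum;
            }
        }
        // Assumption: dashes and any other characters are simply dropped.
    }

    return output;
}
```

Because it takes a string and returns a string, this version can be tried without any files: formatAll("245-984-6445,14968396784") should give "245,9846445,496,8396784".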

Lastly, we output the contents of the result string ("fileOutputContents") to the output file ("output.txt") using our variable "fout". Since the entire file's contents are stored in the string, we can do it in one line of code:
fout << fileOutputContents;
...and we're done! Here's the final code:
#include <cstdlib> //For system().
#include <iostream>
#include <fstream>
#include <string>

using namespace std;

int main()
{
 cout << "Formatting...\n\n";

 fstream fin("input.txt");
 fstream fout("output.txt", ios::out);

 if (fin && fout)
 {
  char nextLine[50000];
  string fileContents = "";

  fin.getline(nextLine, 50000);

  //Read the entire file into the fileContents string.
  while (fin)
  {
   fileContents += nextLine;

   fin.getline(nextLine, 50000);

   if (fin)
   {
    fileContents += "\n";
   }
  }
  
  //Ask the string for its own length (no strlen or <cstring> needed).
  int fileLength = static_cast<int>(fileContents.length());

  string fileOutputContents = "";

  int digitNum = 0;

  for (int i = 0; i < fileLength; ++i)
  {
   //Completely skip the dashes.
   if (fileContents[i] != '-')
   {
    //If it's a new column, reset the digit count and output the new line or comma character like normal.
    if (fileContents[i] == '\n' || fileContents[i] == ',')
    {
     digitNum = 0;
     fileOutputContents += fileContents[i];
    }
    else
    {
     //We're assuming it's a digit now.

     //Check to make sure that this either isn't the first digit in the number or that it's not a 1.
     if (digitNum > 0 || fileContents[i] != '1')
     {
      //If so, output the character.
      fileOutputContents += fileContents[i];

      //Check to see if it's the third digit in the number; if so, output a comma.
      if (digitNum == 2)
      {
       fileOutputContents += ',';
      }

      //Increase the digit counter.
      ++digitNum;
     }
    }
   }
  }

  fout << fileOutputContents;

  cout << "Phone numbers formatted successfully! Check \"output.txt\".";
 }
 else
 {
  cout << "Error either opening \"input.txt\" for reading or opening \"output.txt\" for writing.";
 }

 fout.close();
 fin.close();

 cout << "\n\n";

 system("pause");
}
Hopefully this tutorial has helped you better understand how to use strings, character arrays, and characters in general. Strings and characters show up in countless kinds of programs.

1 comment:

  1. I didn't find this out until after completing the program and the tutorial, but there's actually a better way to get lines: using the "getline" function without "fin." in front.
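
The free "getline" function mentioned in the comment above reads straight into a string, so no fixed-size buffer like "nextLine" is needed. Here is a minimal sketch of how the file-reading step might look with it. The helper name "readAll" is hypothetical, and it takes any input stream so it can be tried with a string stream as well as a file.

```cpp
#include <istream>
#include <sstream> // Only needed for std::istringstream in the usage note.
#include <string>

// Hypothetical helper: reads every line from `in` into one string,
// joining the lines with '\n', using std::getline with a std::string.
std::string readAll(std::istream& in)
{
    std::string contents;
    std::string line;
    bool firstLine = true;

    // std::getline returns the stream, which tests false once reading fails.
    while (std::getline(in, line))
    {
        if (!firstLine)
        {
            contents += '\n';
        }

        contents += line;
        firstLine = false;
    }

    return contents;
}
```

In the program, this could be called as readAll(fin) to fill "fileContents". For a quick check without a file, passing a std::istringstream built from "a,b\nc,d\n" should give back "a,b\nc,d".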
