Bojan Nikolic: Numerical and Quantitative Methods in C++


Using (or not) Files

In the last section I showed how to load data for computation from the so-called standard input, which is usually provided by the user typing on a keyboard. For data sets of almost any size, it is however much more convenient to store the data in a file on the disk and simply specify that these data be loaded and processed rather than having to retype them.

In this section I will show how loading data from a file on disk is implemented in C++, and also why in some cases (this one included) doing so is neither necessary nor desirable; that is, it is better not to implement loading of the data from disk directly in the program.

Alternate program: mean_v7

Here is an alternate version of the mean program that loads the data from a disk file named meandata.dat in the current directory.

// mean_v7.cpp
// Bojan Nikolic <bojan@bnikolic.co.uk>
// Input from files

#include <iostream>
#include <fstream>

#include <boost/format.hpp>

double csum(const double * data,
            size_t n)
{
  double sum =0;
  for (size_t i =0 ; i < n ; ++i )
  {
    sum += data[i];
  }
  return sum;
} 

double cmean( const double * data,
              size_t n)
{
  return (csum(data,n)/n);
}

void readInput(double * data ,
               const size_t n)
{
  std::fstream ifs("meandata.dat",
                   std::ios_base::in);
  for (size_t i =0 ; i < n; ++i)
  {
    ifs>>data[i];
  }
}

int main(void)
{
  const size_t n = 5;
  double data[n];
  
  readInput(data,n);
  
  std::cout<<
    boost::format("The sum is: %.2f and the mean is: %.2f")
    % csum(data,n) % cmean(data, n)
    <<std::endl;
}

At the beginning of the program we have added an include of a new header:

#include <fstream>

This header declares the facilities for dealing with files on disk (the f stands for file).

The only other change to the program is a modification to the readInput function, where we now open the file and read from it. The most important statement is:

std::fstream ifs("meandata.dat",
                  std::ios_base::in);

This statement declares the variable ifs to be a file stream and connects it to a file named meandata.dat. The second parameter to the constructor, in this case std::ios_base::in, specifies the mode in which to open the file. The mode tells the operating system a few essential pieces of information about how you will be using the file, for example whether any translation is required or whether you will be writing to the file. In this case, the value std::ios_base::in specifies that the file will be read from, i.e., the stream should be an input stream.
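The listing above does not check whether the file was actually opened. The following is a minimal sketch of how such a check might look, not part of the original program: the function name readInputChecked and its file name parameter are hypothetical, introduced only for illustration. It opens the file in the same std::ios_base::in mode and reports how many values were read.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical variant of readInput (not in the original program):
// returns the number of values actually read, so that the caller can
// detect a missing or short file.
std::size_t readInputChecked(const std::string &fname,
                             double *data,
                             const std::size_t n)
{
  // Same open mode as in the text: the file is opened for reading only
  std::fstream ifs(fname.c_str(),
                   std::ios_base::in);
  if (!ifs.is_open())
  {
    // The file could not be opened (e.g., it does not exist)
    return 0;
  }

  std::size_t i = 0;
  // Stop at n values or at the first failed extraction
  while (i < n && ifs >> data[i])
  {
    ++i;
  }
  return i;
}
```

A caller could compare the returned count with n and report an error instead of computing a mean from uninitialised data.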

The final change is simply to replace the use of std::cin with the variable ifs. The interfaces for reading from the standard input stream and from file streams are very similar, so no further changes are necessary.

Why not to read from a file in this case

Although being able to use a file for data input into our program is very attractive, the present implementation does have some obvious shortcomings, e.g.:

  • The file has to be named meandata.dat, and this can’t be changed without recompiling the program
  • Processing a very long sequence of numbers can be problematic since there must be enough space for all of the numbers to be on disk at one time
  • Direct input from the keyboard is no longer possible even for very quick computations

Avoiding an explicit file interface

A simple alternative to implementing the read-from-file facility is to make use of the native capability of most operating systems to make a file on disk look like the standard input stream. On Unix systems this is accomplished simply by using the < operator on the command line of most shells:

./mean-6 < meandata.dat
The sum is: 21.00 and the mean is: 4.20

This is possible for this application because we need to access the data strictly sequentially, and once each input number has been processed it can be discarded. If we needed to, say, move back to the beginning of the file and make a second processing pass, this approach would not work.

Given, however, that it does work in this case, it has a number of advantages:

  • If the user wants to enter data manually from the keyboard, that is easily done
  • As described above, input from a file is almost as easy
  • It is also possible to feed in data generated by another program, without saving it to disk, by using the | shell operator