Ex W8
Changed November 15, 2017

Sections: Testat-Exercise 2: KWIC with word class | Container Exercises | 0 KWIC - Keyword in Context (TESTAT) | Automated Checking: | 1 Tallying Words (Application of std::map) - Not Testat | 2 Checking for Palindromes - Not Testat | 3 Generating Anagrams - Not Testat

Testat-Exercise 2: KWIC with word class

The hand-in deadline is Monday, Nov 20, 2017, 12:00 (noon) CET.

Hand in all your source files as attachments to a single email to your exercise supervisor, with CC to peter.sommerlad@hsr.ch. NO ZIP, NO object files, NO Eclipse project.

Container Exercises

This week we familiarize ourselves with the algorithms and containers of the STL. Try to solve the exercises using your word class from the previous weeks (ExW5). Show your word class together with your test cases to your exercise supervisor!

Always start with test cases, and let only your main function use the global stream objects (such as std::cin and std::cout). Separate functionality into functions. Where useful, group related code into namespaces.

0 KWIC - Keyword in Context (TESTAT)

From Parnas [Parnas72] we have a concise definition of the Keyword in Context problem:

The KWIC index system accepts an ordered set of lines, each line is an ordered set of words, and each word is an ordered set of characters. Any line may be "circularly shifted" by repeatedly removing the first word and appending it at the end of the line. The KWIC index system outputs a listing of all circular shifts of all lines in alphabetical order.
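
The "circular shift" in this definition maps directly onto std::rotate. The following is a minimal sketch, assuming a line is stored as a std::vector of std::string; the names line and circular_shifts are made up for illustration and are not part of the exercise.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical alias: one line is an ordered sequence of words.
using line = std::vector<std::string>;

// Returns every circular shift of the given line, starting with the line itself.
std::vector<line> circular_shifts(line words) {
    std::vector<line> shifts{};
    for (std::size_t i = 0; i < words.size(); ++i) {
        shifts.push_back(words);
        // Move the first word to the end; all other words move one step forward.
        std::rotate(words.begin(), words.begin() + 1, words.end());
    }
    return shifts;
}
```

A line with n words yields n shifts, and an empty line yields none.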

Write a program kwic that reads lines from standard input and, for each line, creates every variation in which a different word comes first. Output the stored lines in sorted order so that you can see how the words are used in context.

Example input:

 this is a test
 this is another test

result:

 a test this is
 another test this is
 is a test this
 is another test this
 test this is a
 test this is another
 this is a test
 this is another test

Clarifying example input:

 a b c d
 a a b
 b b c

result:

 a a b
 a b a
 a b c d
 b a a
 b b c
 b c b
 b c d a
 c b b
 c d a b
 d a b c

Tips:

Automated Checking:

Our platform for automated testat checking will expect you to provide the following functionality:

namespace kwic {
void kwic(std::istream & in, std::ostream & out);
}
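
One possible shape for a function with this signature is sketched below. It rests on several assumptions: std::string stands in for your own word class, words are separated by whitespace, comparison is the case-sensitive default of std::string, and identical shifted lines are stored only once because a std::set is used. None of these details is confirmed by the checking platform.

```cpp
#include <algorithm>
#include <istream>
#include <iterator>
#include <ostream>
#include <set>
#include <sstream>
#include <string>
#include <vector>

namespace kwic {

void kwic(std::istream & in, std::ostream & out) {
    using line = std::vector<std::string>;  // assumption: std::string replaces the word class
    using in_iter = std::istream_iterator<std::string>;
    std::set<line> index{};                 // keeps lines sorted, drops duplicates (assumption)
    std::string text{};
    while (std::getline(in, text)) {
        std::istringstream words_in{text};
        line const words(in_iter{words_in}, in_iter{});
        // Each word position becomes the start of one rotated copy of the line.
        for (auto pivot = words.begin(); pivot != words.end(); ++pivot) {
            line shifted{};
            std::rotate_copy(words.begin(), pivot, words.end(), std::back_inserter(shifted));
            index.insert(shifted);
        }
    }
    for (auto const & shifted : index) {
        bool first = true;
        for (auto const & word : shifted) {
            out << (first ? "" : " ") << word;
            first = false;
        }
        out << '\n';
    }
}

}
```

Because the function only sees the two streams passed to it, a test can feed it a std::istringstream and inspect a std::ostringstream, while main simply passes std::cin and std::cout.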

Experimental, for the eager and brave: the service that automatically checks your kwic function and word class (functionality, not style) is online. You can try uploading your word.h and word.cpp to https://alf-file-uploader.infs.ch/. Please note that external login (OAuth) is not implemented.

1 Tallying Words (Application of std::map) - Not Testat

In the previous weeks you implemented the wlist program with std::string (in ExW4) and with your own Word class (in ExW5). In this exercise you will do something similar: instead of just listing all words encountered on a std::istream, you will now count them and print the number of occurrences of each word.

Expected function signature:

void word_tally(std::istream & in, std::ostream & out);

Example input:

Wenn hinter Fliegen Fliegen fliegen, fliegen Fliegen Fliegen nach.

Expected result on the output:

Fliegen: 6
hinter: 1
nach: 1
Wenn: 1

2 Checking for Palindromes - Not Testat

A palindrome is a word or sentence that reads the same forwards and backwards (for this exercise, ignore whitespace). For example, the name "Otto" is a palindrome. Write a predicate is_palindrome(std::string) that takes a string and returns whether it is a palindrome, ignoring letter case.
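
One way to write such a predicate is sketched here. It assumes plain single-byte characters and treats the empty string as a palindrome; both choices are assumptions, not requirements from the exercise.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

bool is_palindrome(std::string s) {
    // Drop all whitespace so that sentences can be checked as well.
    s.erase(std::remove_if(s.begin(), s.end(),
                           [](unsigned char c) { return std::isspace(c) != 0; }),
            s.end());
    // Normalize letter case.
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    // Compare the string with its own reverse.
    return std::equal(s.begin(), s.end(), s.rbegin());
}
```

Taking the parameter by value gives the function its own copy to normalize, so the caller's string stays unchanged.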

Use that function to find all palindromes in the dictionary file /usr/share/dict/words (on Linux, Unix, or Mac OS). That file contains one word per line, so you can filter it with your predicate without storing all the words.

/usr/share/dict/words file for ArchLinux-VM
This file is not installed in the current VM. To install it, issue the following command in a Terminal: sudo pacman -S words. Alternatively, download a copy from here: http://www.cs.duke.edu/~ola/ap/linuxwords

3 Generating Anagrams - Not Testat

Write a program that reads a word from standard input and creates all possible anagrams (permutations of the letters in the word).

Use a data structure to collect the anagrams that keeps them in sorted order and eliminates duplicates.
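
A std::set is one container with both properties. The sketch below combines it with std::next_permutation; the function name anagrams is chosen for illustration only.

```cpp
#include <algorithm>
#include <set>
#include <string>

// Collects all distinct permutations of the letters in word, in sorted order.
std::set<std::string> anagrams(std::string word) {
    std::set<std::string> result{};
    // next_permutation only visits every permutation when it starts from the sorted sequence.
    std::sort(word.begin(), word.end());
    do {
        result.insert(word);
    } while (std::next_permutation(word.begin(), word.end()));
    return result;
}
```

Starting from a sorted word, std::next_permutation already produces each distinct permutation once and in ascending order, so the set mainly documents the intent here; it becomes necessary as soon as anagrams from several sources are merged.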

On Linux/Mac/Unix you can read the file /usr/share/dict/words into a data structure and filter your anagrams so that only valid words remain. Try this with short words first: a word with n distinct letters has n! permutations (factorial(size())), so generating them can take a long time.

How many anagrams that form a word according to your system's dictionary do you find for the word "listen"?

Optional: Can you extend your program so that it also checks for valid anagrams consisting of multiple words, e.g., "Vacation time" = "I am not active"? (This might be a bit harder and slower.)

