go to start Ex W8
|home |print view |recent changes |changed November 7, 2018 |
exact
|You are 18.205.19.203 <- set your identity!

Sections: ''Testat-Exercise 2'': KWIC with word class | Container Exercises | KWIC - Keyword in Context (TESTAT) | Automated Checking: | Not Testat | Tallying Words (Application of =[std::map]=) | Algorithms | Checking for Palindromes | Optional | Generating Anagrams | =[unique]= |

Testat-Exercise 2: KWIC with word class ^

Hand in time is Monday Nov 19 2018, 12:00 (noon) (CET)

Hand in all your source files attached to a single email to thomas.corbat@hsr.ch, peter.sommerlad@hsr.ch . NO ZIP, NO object files, NO eclipse project.

Container Exercises ^

This week we are trying to familiarize ourselves with algorithms and containers of the STL. Try to solve the exercises using your word class from the last weeks (ExW5). Show your word class together with your test cases to your exercise supervisor!

Do always start with test cases first and make only your main function use the global stream objects. Separate functionality into functions. If useful, group stuff into namespaces.

KWIC - Keyword in Context (TESTAT) ^

From Parnas [ Parnas72 ] we have a concise definition of the Keyword in Context problem:

The KWIC index system accepts an ordered set of lines, each line is an ordered set of words, and each word is an ordered set of characters. Any line may be "circularly shifted" by repeatedly removing the first word and appending it at the end of the line. The KWIC index system outputs a listing of all circular shifts of all lines in alphabetical order.

Write a program kwic that reads lines from standard input and creates the variations of the line where each word is in front once. Output the stored lines in sorted order, so that you can see, how the words are used in context.

Example input:

 this is a test
 this is another test

result:

 a test this is
 another test this is
 is a test this
 is another test this
 test this is a
 test this is another
 this is a test
 this is another test

Clarifying example input:

 a b c d
 a a b
 b b c

result:

 a a b
 a b a
 a b c d
 b a a
 b b c
 b c b
 b c d a
 c b b
 c d a b
 d a b c

 

Tipps:

Automated Checking: ^

Our platform for automated testat checking will expect you to provide the following functionality:

namespace kwic {
void kwic(std::istream & in, std::ostream & out);
}

Experimental: The service to automatically check your kwic function and word class (functionality, not style) is online. You can try to upload your word.h and word.cpp to https://uploader.alf.infs.ch/ please note that external login (oauth) is not implemented.

Not Testat ^

Tallying Words (Application of std::map) ^

In the previous weeks you have implemented the wlist program with std::string (in ExW4) and with your own Word class (in ExW5). In this exercise you will do something similar. Instead of just listing all words encountered on an std::istream you will now count them and print the occurrences.

Expected function signature:

void word_tally(std::istream & in, std::ostream & out);

Example input:

Wenn hinter Fliegen Fliegen fliegen, fliegen Fliegen Fliegen nach.

Expected result on the output:

Fliegen: 6
hinter: 1
nach: 1
Wenn: 1

Algorithms ^

An introduction into this exercise is given in this weeks video. See self-study part (SsW8).

For using the existing STL algorithms effectively you have to familiarize yourself with available functionality. We have prepared a CUTE test project (algorithm_trivia) with a large set of test cases which require you to insert the correct STL algorithm in order to get the test green. It contains several test suites, each containing some test cases. Usually, those cases fail with the current implementation. We have replaced the original calls with dummy functions (xxx, xxxx, xxxxx, xxxxxx) that satisfy the interface. You don't have to worry about these helper functions. Your task is to fix the test cases by calling the correct STL algorithm (instead of the helper function).

Since there are quite a few algorithms you don't have to solve all algorithms at one. We suggest to solve 2-3 test suites each week. Each suite contains a hint header that lists the algorithms that have to be used in the corresponding suite file. Each algorithm is required once. Read the description of the algorithms first and then try to insert the correct calls. Some algorthims require a predicate. Most of the time we have used an is_prime function that just checks whether the parameter is prime.

Have Fun!

Checking for Palindromes ^

A palindrome is a word or sentence that can be read from its beginning and its end and results in the same word (in our example ignore whitespace). For example, the name "Otto" is a palindrome. Write a predicate is_palindrome(std::string) taking a string and returning if the string is a palindrome (ignoring letter case).

Use that function to find all palindromes in the dictionary file /usr/share/dict/words (on Linux, Unix, or Mac OS). That file contains one word per line, so you can filter it with your predicate without storing all the words.

/usr/share/dict/words file for ArchLinux-VM
This file is not installed in the current VM. To install it, issue the following command in a Terminal: sudo pacman -S words. Alternatively, download a copy from here: http://www.cs.duke.edu/~ola/ap/linuxwords

Optional ^

Generating Anagrams ^

Write a program that reads a word from standard input and creates all possible anagrams (permutations of the letters in the word).

Use a data structure to collect the anagrams that keeps them in sorted order and eliminates duplicates.

On Linux/Mac/Unix you can read in the file /usr/share/dict/words into a data structure and filter your anagrams according to the valid words. Try this with short words first, otherwise generating the permutations might take a long time (factorial(size())).

How many anagrams forming a word according to your system's dictionary do you find for the word "listen" ?

Optional: Can you extend your program in a way that it also will check for valid anagrams consisting of multiple words, i.e., "Vacation time" = "I am not active" (might be a bit harder and slower).

unique ^

Create the unique unix command line tool, that filters subsequent identical lines.


|home |print view |recent changes |changed November 7, 2018 |
exact
|You are 18.205.19.203 <- set your identity!

Ex W8
go to start