An idea for archiving. Good or bad ?

fmulder
Posts:330
Joined: Wed Feb 03, 2010 9:46 am

An idea for archiving. Good or bad ?

Post by fmulder »

You probably all know the challenge of sizing archive files. We don't want them too big because that is inefficient, and we don't want them too small because then there's a chance that a new archive file will be created sooner than expected. The archive settings define that N files are kept on disk, so when your archives switch faster than expected, you'll have fewer days of data than expected.

Example:
You choose a file size and expect it to be big enough for one day. You configure 30 files to be kept. Unfortunately, the process changes more than you expected and we produce 3 archives per day. Now you'll have just 10 days of data instead of 30! (and the customer will be disappointed)
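The arithmetic in this example can be written down as a small sketch. The function names are made up for illustration and are not part of WinCC OA; the code only restates the relation between the number of files kept, the file switches per day and the days of data on disk.

```python
import math


def retention_days(files_kept: int, switches_per_day: float) -> float:
    """Days of data that stay on disk when only `files_kept` files are retained."""
    if switches_per_day <= 0:
        raise ValueError("switches_per_day must be positive")
    return files_kept / switches_per_day


def files_needed(days: int, switches_per_day: float) -> int:
    """Number of files that must be kept to cover `days` at a given switch rate."""
    if switches_per_day <= 0:
        raise ValueError("switches_per_day must be positive")
    return math.ceil(days * switches_per_day)


if __name__ == "__main__":
    # The example from the post: 30 files, 3 file switches per day.
    print(retention_days(30, 3))
    # Files needed to really hold 30 days at that switch rate.
    print(files_needed(30, 3))
```

With 30 files and 3 switches per day this gives the 10 days mentioned above; holding the full 30 days at that rate would need 90 files.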

I'd like to share an idea and would like to know your opinion.

* choose an archive size where you expect that it will be enough for one day
* configure the archives to hold 1000 files
* Make a datapoint where you specify number of days that you want to keep
* Make a script on the server that checks the archive files once a day. Let the script explicitly archive the files that are older than your configured max-days.

So just like you can archive or restore an archive file manually, my plan would be to make a script that does the same automatically for old files.
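The selection step of such a script could look roughly like the sketch below. It is plain Python, not a WinCC OA CTRL script, and it rests on assumptions: each archive is a single file in one directory, the file modification time is a good enough proxy for the age of the data, and `max_days` is passed in as a parameter (in the plan above it would be read from the datapoint). The function name `files_older_than` is hypothetical. The sketch only lists candidates; the actual swap-out would still have to go through the same mechanism you use manually.

```python
import time
from pathlib import Path

SECONDS_PER_DAY = 86_400


def files_older_than(archive_dir, max_days, now=None):
    """Return the files in archive_dir older than max_days, oldest first.

    Age is judged by modification time, which is an assumption of this sketch.
    """
    now = time.time() if now is None else now
    cutoff = now - max_days * SECONDS_PER_DAY
    candidates = [
        p for p in Path(archive_dir).iterdir()
        if p.is_file() and p.stat().st_mtime < cutoff
    ]
    return sorted(candidates, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    # Example: list everything older than 60 days in the current directory.
    for path in files_older_than(".", 60):
        print(path)
```

Run once a day (cron, Task Scheduler or a timed script), this keeps the retention tied to days rather than to the number of files.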

Has anyone ever thought about this idea? Good or bad?

share the fun
Frenk Mulder

RudiKreiner
Posts:198
Joined: Mon May 16, 2011 2:10 pm

Re: An idea for archiving. Good or bad ?

Post by RudiKreiner »

We have implemented something very similar to this.
Our script regularly checks the free space on the hard disk partition (we have a separate partition for the database) and only deletes the oldest file(s) when there is no longer enough space (with some reserve) for new file switches.
We also have first level compression activated so that only the current files (and the first switch file until it gets compressed) are "big", so that if only a few datapoints are changing fast we don't waste too much disk space.
There is no guarantee how many days of data will be stored, but this method makes optimal use of the hard disk space available.
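A minimal sketch of this free-space approach, in Python rather than CTRL. It is not our production script; the function names and the `required_free_bytes` reserve are illustrative assumptions. The planning step is kept separate from any deletion, so it only reports which files would have to go, oldest first, until the reserve is available again.

```python
import shutil
from pathlib import Path


def pick_files_to_delete(files, free_bytes, required_free_bytes):
    """Choose the oldest files until the required free space is reached.

    files: iterable of (path, mtime, size_in_bytes) tuples.
    Returns the paths to remove, oldest first (empty if space is sufficient).
    """
    to_delete = []
    for path, _mtime, size in sorted(files, key=lambda f: f[1]):
        if free_bytes >= required_free_bytes:
            break
        to_delete.append(path)
        free_bytes += size
    return to_delete


def scan(archive_dir):
    """Collect (path, mtime, size) for every regular file in archive_dir."""
    result = []
    for p in Path(archive_dir).iterdir():
        if p.is_file():
            st = p.stat()
            result.append((p, st.st_mtime, st.st_size))
    return result


def plan_cleanup(archive_dir, required_free_bytes):
    """Check the partition holding archive_dir and plan which files to drop."""
    free = shutil.disk_usage(archive_dir).free
    return pick_files_to_delete(scan(archive_dir), free, required_free_bytes)
```

The reserve should be at least the size of the uncompressed files created by one file switch, so that a switch never fails for lack of space.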

leoknipp
Posts:2928
Joined: Tue Aug 24, 2010 7:28 pm

Re: An idea for archiving. Good or bad ?

Post by leoknipp »

@Frenk: When you have configured your archive for daily file switches and you now get more files per day, you should check why you get more values than expected.
This is probably a better approach than changing the archive configuration to store more files.

Why do you want to archive the files which are outside of the time range?
If the files are not needed anymore you can simply delete them.


@Rudi Kreiner: Normally the time range which is needed online in the system is defined by the requirements of the project and not by the disk space available.
E.g. if unsolicited values are needed for two months it is not necessary to store them for a longer time.

With your approach you should take into account that using a very high number of archive files has some effects, e.g. increased startup time of the archive process. Therefore the number of files should not be higher than the project requirements demand.
When starting the archive manager a check has to be made for every file which exists in the VA directory.

Best Regards
Leopold Knipp
Senior Support Specialist

RudiKreiner
Posts:198
Joined: Mon May 16, 2011 2:10 pm

Re: An idea for archiving. Good or bad ?

Post by RudiKreiner »

Thanks for your comments. I admit that our approach is more pragmatic than deterministic, but it has suited our requirements and those of our customers well for many years now.
With our configuration the partition fills up after about 90 days, after which we start deleting the oldest files. It is not tragic though if we have only 60 or 70 days.
Other applications may have other requirements so I do not claim that our approach is the only right way to do it.

The main reason for this approach was that it is very difficult to predict the number of value changes, which determine the number of file switches, since it depends on many factors like the type of product being produced, possible abnormal behaviour of sensors (which may be as trivial as a loose proximity switch that blinks at a high rate) or the state of maintenance of the plant.

Starting the project does take a few minutes when there are very many files (actually only about 2 minutes for 5000 files, since we use fast processors and SSDs), but after commissioning the projects hardly ever get stopped, usually only at major shutdowns during holidays, when that delay is not critical.

fmulder
Posts:330
Joined: Wed Feb 03, 2010 9:46 am

Re: An idea for archiving. Good or bad ?

Post by fmulder »

I agree with Rudi. "The main reason for this approach was that it is very difficult to predict the number of value changes, which determine the number of file switches".

I've seen several projects where we really tried to do a good design and still found that the files switched faster than we expected. The approach that Rudi described is more drastic than what we need. We need to guarantee that we hold 2 months of data. E.g. I hope that my disk will only have to hold 60 files.
With the script I can guarantee that it will still hold 60 days even when a file switches 3 times in one day.

Rudi, thanks for the feedback.

share the fun
Frenk Mulder

kilianvp
Posts:443
Joined: Fri Jan 16, 2015 10:29 am

Re: An idea for archiving. Good or bad ?

Post by kilianvp »

I wrote a script that automatically swaps the archive files out and zips them.
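For readers who cannot open the attachments further down, the zipping step on its own can be sketched as follows. This is not the attached script; it is a generic Python illustration that assumes the swapped-out archive is a single file, and the name `zip_archive_file` is hypothetical.

```python
import zipfile
from pathlib import Path


def zip_archive_file(src, dest_dir):
    """Compress one swapped-out file into dest_dir and return the zip path.

    The original file is left untouched; removing it is a separate decision.
    """
    src = Path(src)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / (src.name + ".zip")
    with zipfile.ZipFile(target, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # Store only the file name, not the full source path.
        zf.write(src, arcname=src.name)
    return target
```

`dest_dir` could be a mounted network share; a zipped file has to be unpacked again before it can be swapped back in.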

nmnogueira
Posts:125
Joined: Thu May 05, 2011 12:59 pm

Re: An idea for archiving. Good or bad ?

Post by nmnogueira »

I can relate to this. We have several projects where the actual data retention time is occasionally shortened by an unexpected number of value changes.
But the client usually wants to know for how long the data is archived, regardless of the number of files.

One solution is to apply archive filters, but this does not solve every problem...

IMHO, there should be an effort in the future to optimize the archiving structure of WinCC OA by making it more dynamic, maybe allowing a DPE to store its values in the archive space which is allocated for other DPEs...

fmulder
Posts:330
Joined: Wed Feb 03, 2010 9:46 am

Re: An idea for archiving. Good or bad ?

Post by fmulder »

Kilian, would you be willing to share your script ?
Can you also explain why you zip the archives? Do you just do this to save disk space?

My plan would be to just 'back up' the archives to a share on my NAS. The customer can then manually retrieve an archive from that share (zipping it makes that impossible).

Thanks for all your help

share the fun
Frenk Mulder

RudiKreiner
Posts:198
Joined: Mon May 16, 2011 2:10 pm

Re: An idea for archiving. Good or bad ?

Post by RudiKreiner »

Zipping the files really saves a lot of disk space, especially if only one or a few of the configured datapoints change a lot.
Our zipped files are typically 100 to 200 times smaller than the unzipped ones.

kilianvp
Posts:443
Joined: Fri Jan 16, 2015 10:29 am

Re: An idea for archiving. Good or bad ?

Post by kilianvp »

Frenk Mulder wrote:
Kilian, would you be willing to share your script ?
Can you also explain why you zip the archives? Do you just do this to save disk space?

My plan would be to just 'back up' the archives to a share on my NAS. The customer can then manually retrieve an archive from that share (zipping it makes that impossible).

Thanks for all your help

share the fun
Frenk Mulder
Attachments

[The extension ctl has been deactivated and can no longer be displayed.]

[The extension ctl has been deactivated and can no longer be displayed.]

