
   
The Grey Labyrinth Forum Index

Rolling estimate

 
Jack_Ian
Big Endian



Posted: Thu Jun 30, 2011 2:35 pm

I'm hoping some of you maths-heads can help with this problem.

I have to make an estimate of the number of pages required to print some text. I can make some calculations based on the characters used, but I would like to improve my estimate over time based on feedback from actual pages required.

What would be the appropriate model to use for this?

For example, I guess 100 pages, actual result 110 pages.
Next time I guess 10 pages, but adjust it to 11 pages based on previous history. Actual result 12 pages.
Next time I guess 100 pages, but adjust to 115, etc.

I could figure out some sort of weighted result, but I assume there's a known way for doing this sort of thing.
Zag
Unintentionally offensive old coot



Posted: Thu Jun 30, 2011 3:00 pm

There is an additional problem: your knowledge of the feedback process will start to affect your guesses. Secretly, you'll think that your guessing has gotten better, but since you know you are going to apply feedback based on your earlier, less-informed guesses, you will try to make the guess that you would have made before you got better at it. You'll either over- or under-compensate, adding a new variable (i.e. more chaos) to the process. Since it's a feedback system anyway, just let the feedback go into your guessing process. This is something that human brains are very good at!

Better idea: Just put Word into Page View, scroll to the bottom, and see what page number you're on. Just sayin'
Jack_Ian
Big Endian



Posted: Thu Jun 30, 2011 3:32 pm

I can calculate the result reasonably accurately, but some of the more complex features such as kerning and alternate glyphs are being ignored.
I suspect my estimate is biased and I was hoping to remove this bias over time, but wasn't sure of the best way of doing this.
Jack_Ian
Big Endian



Posted: Thu Jun 30, 2011 4:24 pm

I was going to use something like:

Initialise values as follows:
CurrentBiasValue = 0.0
NumberOfResultsTaken = 0

Each time I calculate an estimate:

AdjustedEstimate = UnAdjustedEstimate * (1 + CurrentBiasValue)

NewBiasValue = CurrentBiasValue + (ActualPagesRequired - UnAdjustedEstimate)/(UnAdjustedEstimate * (NumberOfResultsTaken + 1))

CurrentBiasValue = NewBiasValue
NumberOfResultsTaken = NumberOfResultsTaken + 1

Does that look OK?
Is there a better way?
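
A minimal Python sketch of the scheme above, assuming the intent is for CurrentBiasValue to be the running mean of the relative error (Actual - UnAdjusted) / UnAdjusted. Note that the update as posted adds each new relative error divided by the count without subtracting the current bias, so it builds a harmonically weighted sum rather than a mean. The sketch uses the standard incremental-mean update instead. The class and method names are hypothetical.

```python
class BiasTracker:
    """Running mean of the relative error (actual - estimate) / estimate."""

    def __init__(self):
        self.bias = 0.0   # CurrentBiasValue
        self.count = 0    # NumberOfResultsTaken

    def adjust(self, unadjusted):
        """Scale a raw estimate by the bias learned so far."""
        return unadjusted * (1.0 + self.bias)

    def update(self, unadjusted, actual):
        """Fold one observed result into the running mean."""
        relative_error = (actual - unadjusted) / unadjusted
        self.count += 1
        # Incremental mean: move the bias towards the new error by 1/count.
        self.bias += (relative_error - self.bias) / self.count


tracker = BiasTracker()
tracker.update(100, 110)     # first job: raw guess 100, needed 110
print(tracker.adjust(10))    # raw guess 10 is nudged to about 11
tracker.update(10, 12)       # second job: raw guess 10, needed 12
print(tracker.adjust(100))   # raw guess 100 is nudged to about 115
```

With the numbers from the first post, this gives roughly 11 and then roughly 115, which matches the adjustments described there.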
Nsof
Daedalian Member



Posted: Sat Jul 02, 2011 11:33 am

Just to verify the following is correct: each sample is independent of the previous ones (seems reasonable based on the context and discussion).

The method proposed assumes the actual values are proportional to the approximated values, i.e. that the error is purely multiplicative. While it's a good start for estimation, is it correct? If you see that estimates of 10, 100 and 1000 pages all missed the actual by 10, then the error is probably additive rather than proportional. Another example is when the errors vary in a cyclic way (imagine error values changing in a sine wave or even simply -1,1,-1,1,...). A proportional correction will converge to zero, as it should, but it could be replaced with something better.
A scatter plot between ActualPagesRequired and UnAdjustedEstimate can help visualize the relation.
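
As a numeric companion to the scatter plot, here is a rough sketch that fits a straight line actual = slope * estimate + intercept by ordinary least squares. The function name and the sample history are made up for illustration. A slope near 1 with a clearly non-zero intercept points to a constant offset, while an intercept near 0 with a slope away from 1 points to the proportional bias that a multiplicative correction assumes.

```python
def fit_line(estimates, actuals):
    """Ordinary least squares fit of actual = slope * estimate + intercept."""
    n = len(estimates)
    mean_x = sum(estimates) / n
    mean_y = sum(actuals) / n
    sxx = sum((x - mean_x) ** 2 for x in estimates)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(estimates, actuals))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up history in which every job missed by exactly 10 pages.
estimates = [10, 100, 1000]
actuals = [20, 110, 1010]
slope, intercept = fit_line(estimates, actuals)
print(slope, intercept)
```

For this made-up data the fit comes out with a slope of 1 and an intercept of 10, so scaling the estimate by a percentage would be the wrong correction.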

Also, the method presented attempts to minimize the absolute error. Minimizing the error and minimizing the associated cost might not be the same. For example you might prefer to miss upwards rather than downwards. What is the cost/target function?
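
One possible way to encode a preference for missing upwards, sketched under the assumption that past (estimate, actual) pairs are kept: instead of the mean ratio, use a high empirical quantile of actual / estimate as the correction factor. The function name, the `coverage` parameter and the sample data are all hypothetical.

```python
import math


def ratio_quantile(estimates, actuals, coverage=0.9):
    """Smallest observed actual/estimate ratio that is at least as large
    as the ratio in `coverage` of the past jobs (an empirical quantile)."""
    ratios = sorted(a / e for e, a in zip(estimates, actuals))
    index = math.ceil(coverage * len(ratios)) - 1
    return ratios[max(index, 0)]


# Made-up history of raw estimates and the pages actually needed.
estimates = [100, 10, 100, 50, 200]
actuals = [110, 12, 105, 54, 230]
factor = ratio_quantile(estimates, actuals, coverage=0.9)
print(factor)
```

Multiplying a raw estimate by `factor` would have met or exceeded the actual page count in at least 90% of the recorded jobs. A lower `coverage` trades that safety for a tighter estimate.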

Last, perhaps it is better to handle the estimation function directly rather than treating it as a black box to be fixed after the fact. Might not be an option depending on the case.
_________________
Will sell this place for beer
Jack_Ian
Big Endian



Posted: Mon Jul 04, 2011 11:39 am

The scenario is something similar to the following:
The user chooses, online, the items that should be printed. There is a running total of approximate pages required. Once complete, a PDF is created and the actual pages are known. Generating the PDF can take a while, and the user might choose to go back and edit the choices afterwards. I wanted to minimise the need to go back.
The current version has no error correction, but I am keeping track of the error at PDF creation time. I currently over-estimate the pages required by about 7%. At some point in the future I will probably need to make an adjustment, but it's more likely that the users will get used to the bias and make choices accordingly.
Nsof
Daedalian Member



Posted: Mon Jul 04, 2011 3:23 pm

Jack_Ian wrote:
The scenario is something similar to the following:
The user chooses, online, the items that should be printed. There is a running total of approximate pages required. Once complete, a PDF is created and the actual pages are known. Generating the PDF can take a while, and the user might choose to go back and edit the choices afterwards. I wanted to minimise the need to go back.

I don't understand the relation between the number of pages and needing to go back (e.g. do they go back if the number of pages was too far off?).
How much time does generating the PDF take?
If the same user changes some settings and re-runs the process, then the total page count can probably be better estimated based on previous runs in the same 'user context' (the samples are not independent). That might prevent the need for a third run.
Jack_Ian
Big Endian



Posted: Tue Jul 05, 2011 2:21 pm

They pay by the page. If they go over, then they remove something. If they have more room, then they add something.
Nsof
Daedalian Member



Posted: Tue Jul 05, 2011 8:45 pm

Not sure if all the requirements of the Central Limit Theorem are met here (using the errors=actual-estimate as the samples) but if they are AND you plan to over/undershoot the exact estimate (in order to be on some safe side) then you can provide a tighter estimate when users choose a large number of items.
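
A rough sketch of why the estimate can be tighter for large selections, assuming (and this is only an assumption) that each item's page error is independent, has zero mean, and shares a common standard deviation. Under those assumptions the error of the total grows like sqrt(n), so the safety margin shrinks relative to the total as n grows. The function name, `sigma_per_item` and the sample values are hypothetical; z = 1.645 is the one-sided 95% point of the normal distribution.

```python
import math


def padded_total(item_estimates, sigma_per_item, z=1.645):
    """Return (padded total, margin) where the margin is z standard
    deviations of the summed error, assuming independent item errors."""
    n = len(item_estimates)
    total = sum(item_estimates)
    margin = z * sigma_per_item * math.sqrt(n)
    return total + margin, margin


# Items estimated at 2 pages each, with an assumed 0.5 page error per item.
for n in (4, 100):
    padded, margin = padded_total([2.0] * n, sigma_per_item=0.5)
    print(n, round(margin / (2.0 * n), 3))   # margin as a fraction of total
```

The printed fraction is several times smaller for 100 items than for 4, which is the tightening effect described above.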

Next idea is to use a time machine to sneak a peek at the final PDF size.