[Mailman-Developers] [lm@bitmover.com: Re: FW: Staging System - Requirements Meeting Recap]

Larry McVoy lm@bitmover.com
Wed, 13 Oct 1999 18:17:54 -0700


hey there,
	two things: (a) is mailman better than majordomo?  I assume it must
be, right, or you wouldn't be putting all this effort into it.  I'm using
majordomo now and I keep meaning to look for something that has better spam
filters.  Forgive the stupid questions, but should I switch?

(b) JC has told me that you are considering maybe using BitKeeper for mailman
development.  I had a call from a large company today and they sent in their
requirements and I spent a bunch of time answering them.  It might be worth
it for you to skim through the questions and answers, I suspect there is some
overlap.

I'd love it if it turned out that BK was useful to you (and I'd be
equally bummed to find out it didn't work for you; if that's the case I
want to know why: if there are technical reasons for you not using it,
those will almost certainly be fixed, you guys have similar issues to
everyone else so that means if it doesn't work for you, it doesn't work
for very many other people).

Finally, if you have questions, you can mail to me and/or dev@bitmover.com.
The dev alias is all the people hacking on BK right now.

Cheers,

--lm

-----Forwarded message-----

Hi folks, here are BitMover's responses to your requirements.  We think
BitKeeper might be a good fit for you, let us know what you think.

> Open Discussion - Requirements
>   a.. Direct FTP access

We provide SSH (secure remote shell), RSH (BSD remote shell), and SMTP
(email) as transports for moving stuff around.  We could do an FTP transport
but it seems questionable given that ssh is faster and more secure.  If the
issue is Windows, no problem, we support ssh on Windows.  Use it all the
time, in fact, between Windows and Unix.

Oh, I forgot, we also work in local and networked file systems (NFS, SMB)
and do resyncs between Unix and Windows using SMB.

>   b.. Graphical view of the file tree -or- dynamic tree structure

We actually don't have this right now.   But see the checkin response.
We have it on Windows because we integrate with their GUI glop, but
that's not much help.

If we get far enough along and this is a go/no go thing for you, and you're
buying 10,000 seats (:-) then we'll write one.  Seriously, it's on the list.

>   c.. Check-in and check-out facilities

Got 'em.  Both command line and graphical.  The command line stuff is like so

    bk new file		- check a file into the system for the first time
    			- aliases: bk delta -i, bk ci -i, bk admin -i
    bk edit file	- check out a file locked for modification
    			- aliases: bk co -l, bk get -e
    bk ci file		- check in modifications to a file
    			- aliases: bk delta
    
    You can do all of the above on the entire tree, a sub tree, or an explicit list:

    bk -r new 		- checks in everything it finds
    bk -r edit		- locks everything
    bk -r ci -yComment	- checks in everything

    But the best tool is citool, which finds everything you've modified,
    lists it in a graphical tool, and shows you the diffs as you're doing the
    check in.  See http://www.bitkeeper.com/citool.html

>   d.. Open ended client support allowing developers to use any editing
> tool - FTP to local machine with check-out - edit with whatever tool you
> wish - FTP upload to server with check-in

That's exactly the right answer.  Use any editor you want.  What you get with
BK is the revision history files locally.  If you've worked with RCS or SCCS,
in many respects, BK is like that with all the added stuff you need to have
multiple copies of the revision history.  So every playpen is a repository,
you can lock, edit, checkin, browse, etc., everything locally.  No need for
network connectivity except when you want to update your tree or the other
tree.

Oh, and you can update "sideways".  One developer can slurp in the
changes in another developer's tree directly, without going through
the "master" or "shared" tree.   In CVS terms, I don't have to go back
to the master repository, I can talk directly to you.  The hierarchical
nature of the system is strictly a convention thing, you can resync from
anywhere to anywhere.


>   e.. Ability to cluster files into "Change Sets" for version control and
> configuration management

Chuckle.  We have that in spades.  We don't let you move data between playpens
unless you've put it in a changeset (yeah, you could get the diffs and apply
them with patch, we can't stop you, but if you want the system to do it,
you have to "commit" whatever you have done to a changeset).

The normal way you do this is when you pop into citool, there will be
N+1 entries, where N is the number of files with mods, and the (N+1)th entry is
the ChangeSet file.  If you comment the ChangeSet file, you are creating 
a ChangeSet.   The "diffs" shown for the ChangeSet file are the comments
you have just typed in on all the other files like so

    src/foo.c
	Fix a bug in arg processing
    src/foo.h
	Include getopt.h

with the idea being that the ChangeSet comments should be the idea or
concept you just did, while the per file comments are implementation 
details.
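
To picture how the per file comments roll up into what citool shows for
the ChangeSet file, here is a rough Python sketch.  It is not BK code: the
function name and the dict layout are made up for illustration, and the
sorted order of pathnames is an assumption.

```python
# Illustrative only: render per-file checkin comments in the layout
# described above (pathname, then tab-indented comment lines).

def changeset_summary(file_comments):
    """file_comments maps a pathname to a list of comment lines."""
    lines = []
    for path in sorted(file_comments):  # sorted order is an assumption
        lines.append(path)
        for comment in file_comments[path]:
            lines.append("\t" + comment)
    return "\n".join(lines)


comments = {
    "src/foo.h": ["Include getopt.h"],
    "src/foo.c": ["Fix a bug in arg processing"],
}
print(changeset_summary(comments))
```

The ChangeSet comment itself would then sit above this listing and describe
the idea, leaving the listing to carry the implementation details.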

>   f.. Allow user selectable update reports on one or more branches of the
> development system

I think this is what we call a LOD, but I need clarification.  What exactly
do you want?

>   g.. Ability for CGI to be accessible and scriptable from a remote shell -
> should support Mac/PC/Linux/Unix/etc.

Is this a source management system issue?  I don't understand.  And one huge
bummer - we don't support the Mac; the system depends pretty heavily on
various Unix tools (sed, sh, tr, diff, etc), which we provide for Windows
but haven't worked up the oomph to do on the Mac.  If the Mac is a requirement,
we can support MacOS X no problem, that's Unix, but MacOS 8 is a real drag.
Let me know, that might be a show stopper.  There are ways we can work
around this, using appletalk - you could do the source management crud on
Linux and the editing on a Mac.  Definitely a hack but I can relate to
the requirement - the Mac has a bunch of useful tools for Web folks.

Long term we will do a port.  But that isn't likely to happen in the next
few months.

>   h.. Change Logs

You get these for free, it's the ChangeSet comment history.

>   i.. Concurrent Development of files - requires intelligent merging agent
> for coordinating updates from multiple developers (CVS does this now)

We do this better than anyone.  Period.  Everyone else screws up your
history.  Here's how: you have two developers A and B.  They both do CVS
update and have the same tree.  They both modify file foo.c.  A checks
in first, so B "lost the race" and has to merge.  What CVS will check in
is not B's changes, it's B's changes plus whatever B had to do to merge.
That is **two events** collapsed into one.  Why is this a big deal?  Undo.
Suppose B's stuff was good and A's stuff wasn't.  We can reconstruct
B's stuff without the merge.  CVS can't.  It lost the information.

Look at http://www.bitkeeper.com/sccstool.html - what you are seeing is the
race and the merge.  I'm "lm" and "awc" is my Windows guy.  We were working
in parallel and the graph that you are seeing is the revision history after 
we merged.  1.89 is the merge delta, 1.86.1.1 and 1.86.1.2 are my deltas.
If I want to reconstruct just my work, I do an undo and tell it which
branch I want gone.

And we do all this automagically.  As far as the user is concerned, that
tree of changes is a straight line with no branches.  The branches are
created automatically when changes are merged in; you never have to do that
by hand.
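
The "two events collapsed into one" point can be modeled in a few lines of
Python.  This is a toy model, not how BK or CVS store anything: the Delta
class, the revision numbers, and the helper names are all hypothetical.
The only thing it tries to show is that when the merge is its own delta,
B's real work still exists as a delta that does not depend on A's.

```python
# Toy history model; every name and revision number here is made up.
from dataclasses import dataclass, field


@dataclass
class Delta:
    rev: str
    user: str
    parents: list = field(default_factory=list)
    is_merge: bool = False


def work_by(history, user):
    """Revisions a user authored, not counting merge deltas."""
    return [d.rev for d in history if d.user == user and not d.is_merge]


def ancestors(history, rev):
    """All revisions reachable by following parent links from rev."""
    by_rev = {d.rev: d for d in history}
    seen = set()
    stack = list(by_rev[rev].parents)
    while stack:
        r = stack.pop()
        if r not in seen:
            seen.add(r)
            stack.extend(by_rev[r].parents)
    return seen


# Merge kept as a separate event: B's work sits on its own branch.
separate = [
    Delta("1.1", "A"),
    Delta("1.2", "A", ["1.1"]),                # A wins the race
    Delta("1.1.1.1", "B", ["1.1"]),            # B's actual change
    Delta("1.3", "B", ["1.2", "1.1.1.1"], is_merge=True),
]

# Merge collapsed into the checkin: B's work and the merge are one delta.
collapsed = [
    Delta("1.1", "A"),
    Delta("1.2", "A", ["1.1"]),
    Delta("1.3", "B", ["1.2"]),                # work + merge fix-ups
]

print(work_by(separate, "B"), "1.2" in ancestors(separate, "1.1.1.1"))
print(work_by(collapsed, "B"), "1.2" in ancestors(collapsed, "1.3"))
```

In the first history B's delta has no dependency on A's 1.2, so either side
can be dropped on its own.  In the second, B's only delta sits on top of A's
work and there is nothing left to separate.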

>   j.. Access Control for each user of the development environment: Per user
> basis and Per file basis

We don't do this because this is an operating system issue.  How you achieve
the same effect is to restrict access using standard Unix file permissions
on the master repository.

>   k.. Provide secure remote access to environment either through secure CRT,
> SSL, domain restrictions, etc.

SSH.

>   l.. Ability to differentiate gear, products and templates for exclusive
> use (for one private label) and public use (all private labels)

I don't understand this requirement.

> Roles and Responsibilities
> Engineering
>   a.. Priviledged access (on a user by user basis) to everything on the site
> (cgi, kernels, html, graphics, etc.)

This is done through the OS.   If you have an account and you can read and 
write the files, then you can read and write the files.

>   b.. Utilize a versioning source control system (SCS) for updates

We provide something which is file format compatible with SCCS, AT&T's
original revision control system from the 70's.  A lot of people question
this, there are claims that the SCCS file format is worse than RCS.
That's not true, in fact, the opposite is true.  RCS has one potential
advantage that they don't even use: they store the most recent delta
as a clear file and all the previous ones as diffs (backward diffs
for going up the trunk and forward diffs for going down the branches).
If RCS stored an offset and a size in the file, then they could seek to
the clear text and write the file out in one system call.  They don't
do that so they end up reading the whole file anyway (or most of it, it
depends).  Whatever.  Someone could modify RCS to do what I said and it
would still suck.  Here's why:

. No checksum.  SCCS checksums the file and verifies the checksum every
  time you get the file.  BK adds an additional per delta checksum.  
  The point is that you put your IP into a system; that system should
  guard that IP very carefully.  RCS makes a performance vs integrity
  tradeoff which will end up screwing you in the long run.

. Annotation.  SCCS can trivially give you a copy of the file with each
  line prefixed by any combination of the revision/user/date which added
  that line.  In addition, BK can check out a copy of the file with 
  every line in every revision in the file.  Think about that.  You
  know that somebody changed something and you know the string they 
  changed but you don't know when.  In BK you can find out by doing this

      bk sccscat -mu foo.c | grep string

. ChangeSets.  SCCS is a changeset engine, people just don't know it.  What
  that means is that I can edit a file on one branch and say "I also want
  that delta over there which is on a different branch".  SCCS will happily
  include it.  You can create new deltas and include/exclude any arbitrary
  list of deltas.  It may not be obvious, but this is really cool and has
  far reaching ramifications.  Not only can RCS not do this, if you tried
  to kludge it in, it would grow the file linearly for each include.  In
  SCCS, including or excluding a delta costs you about 4 bytes per included
  delta.   In other words, you can construct multiple different views of 
  your data and it doesn't cost you.  Ping me in the conference call about
  this if I did a poor job explaining it, it is profoundly important.  It's
  what makes SCCS a ChangeSet engine and what makes RCS not a ChangeSet
  engine.
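
The annotation and include/exclude points are easier to see with a toy
"weave".  The Python below is a sketch under heavy assumptions: the real
SCCS file format is different, and WEAVE, checkout, annotate, and all_lines
are names invented here.  The idea it illustrates is that every line is
stored once, tagged with the delta that inserted it and the deltas that
deleted it, so a view of the file is nothing more than a set of delta names.

```python
# Hypothetical, simplified weave.  Each entry is
# (text, inserting delta, set of deleting deltas).
WEAVE = [
    ("int main(void)", "1.1", set()),
    ("{", "1.1", set()),
    ("    return 1;", "1.1", {"1.2"}),        # removed by 1.2
    ("    return 0;", "1.2", set()),          # added by 1.2
    ("    /* branch work */", "1.1.1.1", set()),
    ("}", "1.1", set()),
]


def visible(entry, included):
    """A line shows up if its inserter is included and no deleter is."""
    _, inserted_by, deleted_by = entry
    return inserted_by in included and not (deleted_by & included)


def checkout(weave, included):
    """The file as seen through an arbitrary set of deltas."""
    return [e[0] for e in weave if visible(e, included)]


def annotate(weave, included):
    """Same view, each line prefixed by the delta that added it."""
    return [f"{e[1]}\t{e[0]}" for e in weave if visible(e, included)]


def all_lines(weave):
    """Every line from every revision, prefixed by its inserting delta."""
    return [f"{ins}\t{text}" for text, ins, _ in weave]


# Trunk only, then trunk plus the branch delta "over there".
print(checkout(WEAVE, {"1.1", "1.2"}))
print(checkout(WEAVE, {"1.1", "1.2", "1.1.1.1"}))

# The "when did that string exist?" search from the annotation bullet.
print([line for line in all_lines(WEAVE) if "return 1" in line])
```

Including or excluding a delta only changes the set passed to checkout; the
weave itself does not grow, which is the cheapness the ChangeSets bullet is
getting at.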

>   c.. Each engineer works in an independant workspace (currently known as
> playpens)

Yup.  Each engineer can have as many playpens as s/he wants.  Each is
fully independent and get this: NO NO NO environment variables.  You work
in a playpen by saying

    cd ~/playpen/src
    bk vi foo.c
 
If you want to work in a different playpen, you just say

    cd ~/different_playpen/src
    bk vi bar.c

Environment variables suck.  Just say no.  Dare to break the environment
variable habit :-)

>   d.. Depending on the engineer's privileges - able to push edits to staging
> server at the branch or global levels

If they have a login and write permissions on the staging area, they can.
I really felt no need to reinvent Unix file permissions.  Yeah, this
screws the NT people but they can emulate the same thing by limiting access
to the machine which has the staging area.

> Web Development
>   a.. Priviledged access (on a user by user basis) to limited content on the
> site (html, graphics, etc.)
>   b.. Utilize a versioning source control system (SCS) for updates
>   c.. Each developer works in an independant workspace (currently known as
> playpens)
>   d.. Depending on the developers privileges - able to push edits to staging
> server at the branch or global levels

Same as above.

> Quality Assurance
>   a.. Utilizes a selective build configuration management system (CMS) that
> allows Q/A to selectivey test and build portions of the site

We're weak here in some regards.  We don't support partial repositories. 
In other words, when you do a resync, you get the whole tree.  To deal 
with this, you will naturally split your project up into chunks.  Each
of these chunks is a repository.  So far so good, but the one bummer is
when you want to share data between two repositories.  We currently don't
have any way for data to be in two different "chunks" (we call 'em projects)
at the same time.

This needs more explanation so please ping me during the conference call.

>   b.. Ability to rollback change sets and individual files

Chuckle.  You bet.  The easiest (but somewhat slow) way to roll back is
like so:  suppose you want to roll back to ChangeSet 1.123 (which was
conveniently tagged with alpha2).

	$ bk resync -r..alpha2 master alpha2-test

That will create a repository which is identical to the master repository 
when alpha2 went in (by the way, the tag is just for clarity, anywhere you
use a tag you can use a revision).

The other way is suppose you had a tree with ... alpha2 - 1.124 - 1.125
and you wanted alpha2.  You can do this

    $ bk undo 1.124,1.125

and that will DESTROY those two changesets.  (You'd better have a copy of
them somewhere if you want 'em back.)

>   c.. Tie in builds with bug tracking system to complete a closed loop
> tracking process

Busted.  We want to do this but it will be 2.0 or perhaps 3.0 before we
have it.  We have a plan for doing it but right now we don't have diddly.

>   d.. Test and submit builds for Go Live!

???

> Release Management
>   a.. Schedules and approves builds for go live
>   b.. Coordinates builds with Q/A for testing

Seems OK.

> Private Label Partners
>   a.. Restricted access to their branch
>   b.. Access control based on a case by case basis

I don't know what these are.

---------------------------------------------

One thing you forgot to ask about is file renames.  This is another place
where people fall down.  We don't (surprise!).  We handle file pathname
changes identically to file content changes.  Pathnames are revisioned and
propagate.  Consider a couple of test cases:

	You		Me		Resync you to me
	mv foo bar	nothing		moves foo to bar in my tree

	change foo	mv foo bar	applies change in foo to bar in my tree
	
	mv foo bar	mv foo blech	prompts you with name conflict

This is not a big deal until you need it to work but then it is a huge deal.
It can bring your development to a halt for days while your engineers 
unscramble the mess.  We just make that problem go away.
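
The three test cases behave like a three-way merge applied to the pathname
instead of the contents.  Here is a minimal Python sketch of that rule.  It
is an illustration of the table, not BK's implementation, and merge_path is
a made-up name.

```python
# Three-way merge of a single file's pathname.  Illustrative only.

def merge_path(base, yours, mine):
    """Return (path, conflict) given the common ancestor name and both sides."""
    if yours == mine:
        return yours, False     # same name on both sides
    if yours == base:
        return mine, False      # only I renamed: keep my name
    if mine == base:
        return yours, False     # only you renamed: take your name
    return None, True           # both renamed differently: prompt


# Row 1: you "mv foo bar", I did nothing.
print(merge_path("foo", "bar", "foo"))
# Row 2: you changed foo's contents, I "mv foo bar".
print(merge_path("foo", "foo", "bar"))
# Row 3: you "mv foo bar", I "mv foo blech".
print(merge_path("foo", "bar", "blech"))
```

In row 2 the content change still lands in the right place because the file
is tracked as the same file under either name; only the pathname goes
through the rule above.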

You also forgot file permissions.  We pick up the permissions as of the
time that the file was originally checked in with "bk new".  After that, if you
want to change them, you just say "bk chmod 755 foo.sh" and it saves the
modes.  On Windows, only the top bits (owner permissions) are used, but
it doesn't stomp on the lower bits.
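
One way to read the Windows behavior is as plain bit masking.  The Python
below is a guess at the semantics, not BK code: both function names are
invented, and the assumption is that Windows looks only at the owner bits
while the group/other bits already recorded are carried along untouched.

```python
# Assumed semantics, for illustration only.
OWNER_BITS = 0o700  # rwx for the file's owner


def effective_on_windows(stored_mode):
    """Assumption: only the owner bits matter on Windows."""
    return stored_mode & OWNER_BITS


def chmod_from_windows(stored_mode, requested_mode):
    """Assumption: change the owner bits, keep the recorded lower bits."""
    return (stored_mode & ~OWNER_BITS) | (requested_mode & OWNER_BITS)


print(oct(effective_on_windows(0o755)))
print(oct(chmod_from_windows(0o644, 0o755)))
```

Under that reading, a file recorded as 644 that gets a "755" on Windows would
come back as 744 on Unix: the owner gains execute and the group/other bits
stay as they were.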

> Possible Vendors
>   a.. Continuus - Continuus http://www.continuus.com/
>   b.. Interwoven - Teamsite http://www.interwoven.com/
>   c.. Clear Case - Need Vendor Name
		Rational Software, http://www.rational.com
>   d.. Bit Keeper - Need Vendor Name
	    BitMover, Inc.  And it is "BitKeeper" if you want to be picky.
	    http://www.bitkeeper.com

Also consider the following:

   CVS (it's free and I can show you what you could do to sort of have some
   of these features, not all, but some).

   Perforce, http://www.perforce.com - I know Chris, great guy, nice little
   tool.  It doesn't do what you want but it is quite popular and you should
   know it exists.

   TrueChange from TrueSoft, http://www.truesoft.com - this is the only
   commercially available ChangeSet engine other than BitKeeper.

Cheers,

Larry McVoy
President, BitMover, Inc.

-----End of forwarded message-----

-- 
---
Larry McVoy            	   lm@bitmover.com           http://www.bitmover.com/lm