October 27, 2003
AdAssassin
For Internet Explorer on Windows, this is the best and smartest pop-up killer in the biz: AdAssassin by my buddy Ilya.
Posted by juliob at
02:19 PM
October 25, 2003
CPAN: Installing Specific Versions of Packages
The question is: "is there any way of installing a specific version of a CPAN module?"
The answer is: yes, but you need to specify the entire "path" of the module's distribution file (author directory plus tarball name), e.g. S/SA/SAMPO/Net_SSLeay.pm-1.17.tar.gz
Posted by juliob at
12:26 PM
October 21, 2003
DB_File 1.806 for SpamAssassin 2.60
SpamAssassin 2.60 requires the installation of DB_File for its Bayes filter.
Here's the problem I ran into when I compiled it. It happens during testing:
DB_File needs compatible versions of libdb & db.h
you have db.h version 4.0.14 and libdb version 2.4.14
The README file tells me that I'm pretty much screwed since I use perl 5.6.0 and it includes a version of the Berkeley DB which is going to make it hard for me to match up with db.h.
I give up on this installation and wait for the day when my local perl installation will be magically updated to 5.6.1 or later. In the meantime, my spamassassin will run without Bayes.
Posted by juliob at
03:43 PM
October 15, 2003
Firewall Reviews 2003
This is a comparison of personal firewall software for Windows PCs.
Kerio remains the best firewall for my needs. The free version has all the
features of the paid version and it's very easy to use even when creating
advanced rules. However, the current beta version has bugs, so
I would recommend waiting until it's released. Meanwhile, Kerio 2.1.5 should
work very well.
Second choice is Outpost. I did not like the fact that its interface blends
with Windows XP (other people may like that). I want the alerts to stand out
from all other windows I may have on my desktop. It recognized common
applications and already had rules for them. But it had a bug where the ports
were not stealth even though the option was enabled. This firewall may be a
good choice for non-expert users.
Third choice is ZoneAlarm Pro (or Plus if not interested in privacy features
such as cookie and ad blocking). The free version lacks the expert rules
feature. But keep in mind that ZoneAlarm is not trivial to configure.
If interested in IM security, IMSecure Pro should be considered either
individually or in a bundle with ZoneAlarm.
For a non-expert user the choices should be: Kerio, Outpost, ZoneAlarm.
Details (in the order of my ranking):
1) Kerio 4.0.4 (beta):
+ my long time favorite
+ all the features I am looking for
+ free home version is identical to the paid version, so you get all
the features
- it's beta, it still has bugs
2) Outpost 2.0 Pro:
+ pluggable modules; some free ones are available, such as spyware blocking,
ad blocking, etc.
+ features on par with Kerio and ZoneAlarm
+ recognizes common applications and prompts you to add the rules for
those apps (on an app by app basis or all known apps). Bonus: It even
recognizes MyIE :)
- interface blends with XP (I don't like it because I want my alert
windows to look different from regular windows so I pay attention and
not just click OK by accident)
- ports appear as closed, not stealth, even though it has an option to
stealth ports -- it must not work right as I have it enabled
- free home version lacks features, Pro is the way to go here
3) ZoneAlarm 4.0 Pro:
- a little hard to configure, it has too many options, sometimes not in
obvious places
- free version lacks expert rules support
- doesn't have global setting to allow DNS, a custom rule needs to be made
- it blocked IP printing by default
+ nice how you can close multiple alerts with one click (though you can
miss other important alerts if not paying attention)
+ should consider a bundle with IMSecure (secure IM for all protocols)
4) Sygate 5.0 Pro:
+ very popular, a lot of people swear by it
+ free home version available, though not all features available
- not my impression, it's easy to use for regular users, but hard for
advanced, hard-core users
- hard to make advanced rules
- doesn't have the concept of a trusted area
- the UI is kinda weird, sometimes windows are blocked to a fixed size
and you have to scroll left to right a lot
5) Tiny 5.0:
+ the most advanced of them all
+ can protect files and registry
- very very hard to configure
- not tiny, it's a monster
- didn't block anything out of the box but I must have selected
something wrong during the install wizard
- UI is HTML based, it's SLOW
- you can write your own UI for it (some people think this is +)
6) [Name deleted because they spammed my blog with advertisement] 3.11:
- it's an amateur firewall, needs to grow a lot more
- doesn't protect against application replacement (MD5 checksum)
- no trusted zone
- translation to English is Jackie Chan style
- no free version, but it's the cheapest of all of them ($19.95)
+ pop-up stopper
+ some spyware cleanup
Posted by dracula at
01:26 PM
October 07, 2003
Designing Typed DataSets in Visual C#'s XML Designer
These are my personal notes for designing typed datasets in Visual C#.NET's XML Designer.
Naming Conventions
- Name your dataset file and your dataSetName with the "Set" postfix, e.g. CustomerSet.xsd
- Pluralize the name of your Table elements, e.g. Customers
- Capitalize the names of your elements, which corresponds to DB Schema conventions although XML elements tend to be lower-cased.
Inserting New Rows Into a Typed DataSet
The Visual Studio documentation is incorrect where it explains how to insert new rows into a typed dataset.
It gives the example:
// C#
DataRow anyRow = DatasetName.ExistingTable.NewRow();
anyRow.FirstName = "Jay";
anyRow.LastName = "Stevens";
ExistingTable.Rows.Add(anyRow);
But this is incorrect as the DataRow type is generic and does not contain the FirstName or LastName fields. A more specific type and a cast are necessary, as in:
// C#
ExistingDataSet.ExistingTableRow anyRow =
    (ExistingDataSet.ExistingTableRow) DatasetName.ExistingTable.NewRow();
anyRow.FirstName = "Jay";
anyRow.LastName = "Stevens";
DatasetName.ExistingTable.Rows.Add(anyRow);
But if it's going to be this complicated, you might as well follow the alternative instructions that work for both untyped and typed datasets. See the documentation.
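For comparison, here is a minimal sketch of the untyped-style insert, which indexes columns by name and so needs no cast. The table and column names mirror the example above; the class and method names (UntypedInsertSketch, BuildTable, AddPerson) are made up for illustration and are not from the documentation.

```csharp
using System;
using System.Data;

public static class UntypedInsertSketch
{
    // Builds a stand-in for ExistingTable with the two columns used above.
    public static DataTable BuildTable()
    {
        DataTable table = new DataTable("ExistingTable");
        table.Columns.Add("FirstName", typeof(string));
        table.Columns.Add("LastName", typeof(string));
        return table;
    }

    // Creates a row, fills it by column name, and adds it to the table.
    public static DataRow AddPerson(DataTable table, string first, string last)
    {
        DataRow row = table.NewRow();   // generic DataRow, no cast needed
        row["FirstName"] = first;
        row["LastName"] = last;
        table.Rows.Add(row);
        return row;
    }

    public static void Main()
    {
        DataTable table = BuildTable();
        AddPerson(table, "Jay", "Stevens");
        Console.WriteLine(table.Rows.Count);
    }
}
```

The trade-off is that column names become strings checked only at run-time, which is exactly what the typed row's properties are meant to avoid.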
Inserting Related Rows Into Related Tables
If you create an XML Schema that has 2 tables in a parent-child relationship, then a DataRelation will be automatically created for you by the Designer. You won't see it, but a column in the child table will automatically be created for a foreign key to match up with the primary key in the parent table. The question is: how do you insert a new row into both tables so that the relationship is maintained? Do the following in this specific order:
- Create the parent row, set the column values, and add it to the parent table
- Create the child row, set the column values
- Call SetParentRow() on the child row with the parent as the argument, e.g. childRow.SetParentRow(parentRow)
- Add the child row
I could not find how to do this in the documentation either.
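The ordering above can be sketched on an untyped DataSet, where the call that links the child to its parent is DataRow.SetParentRow. This is only an illustration under assumptions: the Orders and OrderLines tables, their columns, and the relation name are hypothetical, and the relation is built by hand here rather than generated by the Designer.

```csharp
using System;
using System.Data;

public static class RelatedRowsSketch
{
    // Builds a hypothetical parent/child pair of tables joined by a DataRelation.
    public static DataSet BuildDataSet()
    {
        DataSet ds = new DataSet("OrderSet");

        DataTable parents = ds.Tables.Add("Orders");
        DataColumn parentKey = parents.Columns.Add("OrderId", typeof(int));
        parentKey.AutoIncrement = true;
        parents.Columns.Add("Customer", typeof(string));
        parents.PrimaryKey = new DataColumn[] { parentKey };

        DataTable children = ds.Tables.Add("OrderLines");
        DataColumn foreignKey = children.Columns.Add("OrderId", typeof(int));
        children.Columns.Add("Item", typeof(string));

        ds.Relations.Add("Orders_OrderLines", parentKey, foreignKey);
        return ds;
    }

    public static DataRow AddOrderWithLine(DataSet ds, string customer, string item)
    {
        DataTable parents = ds.Tables["Orders"];
        DataTable children = ds.Tables["OrderLines"];

        // 1. Create the parent row, set its values, add it to the parent table.
        DataRow parentRow = parents.NewRow();
        parentRow["Customer"] = customer;
        parents.Rows.Add(parentRow);

        // 2. Create the child row and set its values.
        DataRow childRow = children.NewRow();
        childRow["Item"] = item;

        // 3. Link the child to the parent; this copies the key into the child.
        childRow.SetParentRow(parentRow);

        // 4. Add the child row.
        children.Rows.Add(childRow);
        return childRow;
    }

    public static void Main()
    {
        DataSet ds = BuildDataSet();
        DataRow child = AddOrderWithLine(ds, "Jay Stevens", "Widget");
        Console.WriteLine(child.GetParentRow("Orders_OrderLines")["Customer"]);
    }
}
```

Adding the parent first matters because the foreign key constraint on the child table is checked when the child row is added.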
Assigning a DataSource to a DataView at run-time
If a DataSource is created at run-time in the Text Editor as opposed to design time in the Designer, and you want to have a DataView associated with it, this DataView should also be initialized at run-time. The problem is if you assign the source to the DataView after its EndInit() method is called, then RowFilter won't work.
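A minimal sketch of the safe ordering, assuming a hand-built table: the source table is assigned to the DataView between BeginInit() and EndInit(), and the RowFilter is set in the same window. The table, columns, and filter expression are hypothetical, and I have not checked whether later framework versions still show the problem described above.

```csharp
using System;
using System.Data;

public static class DataViewInitSketch
{
    // Builds a small hypothetical source table at run-time.
    public static DataTable BuildTable()
    {
        DataTable table = new DataTable("Customers");
        table.Columns.Add("Name", typeof(string));
        table.Columns.Add("Age", typeof(int));
        table.Rows.Add(new object[] { "Ann", 25 });
        table.Rows.Add(new object[] { "Bob", 41 });
        return table;
    }

    // Initializes the view at run-time, assigning the source before EndInit().
    public static DataView BuildFilteredView(DataTable table, string filter)
    {
        DataView view = new DataView();
        view.BeginInit();
        view.Table = table;        // source assigned while initialization is open
        view.RowFilter = filter;
        view.EndInit();
        return view;
    }

    public static void Main()
    {
        DataView view = BuildFilteredView(BuildTable(), "Age > 30");
        Console.WriteLine(view.Count);
    }
}
```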
Posted by juliob at
01:54 AM
October 06, 2003
XML Schema Design Tips
Designing an XML Schema is not easy. Often, there seem to be many ways of achieving the same result, but generally one finds that some choices are better than others. This article is aimed at reducing the amount of inevitable trial-and-error to make the best decisions. It is written at a high-level and assumes that you are familiar with XML Schema and its terminology and that you've attempted to write your own schema. I provide few examples or illustrations.
Choices
First, one needs to understand the differences between similar concepts and keywords before deciding which path to take when drafting a schema. Here is a discussion of some of the more complicated choices one has to make.
- Complex Types vs. Model Groups (better)
When defining a group of elements that can be referenced in different places in your schema, you need to either create a Complex Type or a Group (we'll ignore Simple Types for this discussion). Which to use is not always clear. You should read the article W3C XML Schema Made Simple. To summarize, Complex Types are complicated, so Groups should be used as much as possible. On the other hand, whereas Groups only specify Elements, Complex Types allow for the specification of Attributes as well.
- Global Type vs. Local Type (better)
- In order to avoid a complicated schema that is hard to navigate, you should avoid converting Local Types to Global Types unless you have to. Local Types abstract your design better. The only Types that need to be globalized are those involved in recursion, needed by multiple Elements, or that serve as the Base Type for other derived Types.
- That said, it might be useful to have Global Types for extensibility, but you don't need to worry about that when you're designing the first version of your schema. Once you're done, you can think about globalizing certain Types for extensibility.
- Global Type as Single Base (avoid)
One reason to globalize Types might be to architect single-rooted Type hierarchies so as to factor out as much definition as possible for reuse, since only Global Types can be inherited. Especially if you come from an object-oriented programming background, you might decide early in your design process to figure out a list of Types that you can make Global to serve as the Base for all other Types. However, this objective of creating Global Types for definition reuse does not work well.
- As xfront.com says, anything over 3 levels of derivation is too confusing and unnecessary. I found that to be the case as well and now strive for only a single level of derivation.
- Sometimes it's just not possible for two closely related Types to share a common base Type. Take the case of a Complex Empty Type and a Simple Type. For example, you might want to keep <elemWithNoArg /> and <elemWithArg>arg</elemWithArg> close together, but their Type definitions cannot be specified in such a way as to share a common Type hierarchy since the XML Schema Specification does not allow it.
- If all you want to do is share Attributes between some of your Types, use Attribute Groups instead of Global Types. This is a more flexible approach anyway as you might find out later that one of your Types should not accept some of the Attributes that you had anticipated to be reusable.
- Global Type (better) vs. Global Element
- Between a Global Element and a Global Type, it's better to make a Global Type because it doesn't lock you into a single Element name.
- In addition, having a Global Element is not equivalent to using a Global Type. It completely changes the semantics of the schema. Unlike a Global Type, a Global Element allows an XML author to specify different types of root Elements. But as one can judge from the XML Schemas drafted these days, one generally wants to allow only one root Element. So only define one Global Element, and use Global Types and Groups for everything else.
Attribute Defaults
Specifying Attribute Defaults in an XML Schema is highly controversial.
Attribute Defaults have nothing to do with validation and are instead part of the PSVI (Post-Schema-Validation Infoset) process, which is not easy to work with in Xerces.
Note also that they are not supported by RELAX NG, in case the flexibility to convert your schema to other formats is desired.
In other words, don't use them. Just use the ATTS StyleSheet (or a modified version of it) and a simple XML file that contains the Attribute Defaults.
Reminders
XML Schema Terminology
Making sense of XML Schema (XS) terminology is like shitting a bowling ball. Here's some help.
General Terminology Guidelines
- Type refers to everything about an Element or Attribute, whereas Content only refers to what's in between the opening and closing tag of Elements. In other words, unlike Type, Content describes an Element's child Elements or Text or the fact that the Element is Empty; it does not cover whether the Element has Attributes or not.
- A Simple Element is an Element that only contains Text, whereas a Complex Element is any other type of Element, i.e. an Element that is Empty, or has Attributes, or has child Elements, or any combination of the above and/or Text.
- When one talks of Simple Elements or Complex Elements, one means Elements of Simple Type and Elements of Complex Type. Note that by itself, the term Simple Type could be applied to Elements or Attributes.
Terminology
- Simple Element and Simple Type
Definition: a Simple Element is an XML Element that can only contain Text, and neither Child Elements nor Attributes.
E.g.: XML: <dateborn>1968-03-27</dateborn>
XS: <xs:element name="dateborn" type="xs:date" />
NOTE: for an Element to only contain Text, its type has to be either a pre-defined XML Schema datatype or a custom Simple Type (see next definition).
- Custom Simple Type for an Element or Attribute
Definition: A Custom Simple Type is a new Simple Type based on a List of values or on a Restriction or Union of Simple Type(s). The base Simple Type(s) can be pre-defined XML Schema datatypes or other Custom Simple Types. The XS element xs:simpleType comes into play when you create a new Simple Type.
E.g.: XML: <age>100</age>
XS: <xs:element name="age">
      <xs:simpleType>
        <xs:restriction base="xs:integer">
          <xs:minInclusive value="0"/>
          <xs:maxInclusive value="100"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:element>
- Complex Element and Complex Type
Definition: A Complex Element is any XML Element that cannot be considered a Simple Element. There are 4 types of Complex Elements.
- Complex Empty Element
Definition: Element that has null Content (but can optionally have Attributes)
E.g.: XML: <product prodid="1345"/>
XS: <xs:element name="product">
      <xs:complexType>
        <xs:attribute name="prodid" type="xs:positiveInteger"/>
      </xs:complexType>
    </xs:element>
NOTE: Contrast with <xs:element name="product"/>, which counter-intuitively does not specify an Empty Element; it specifies an Element of any Type and any Content.
- Complex Elements-Only Element
Definition: Element that can contain only other Elements (and optionally Attributes)
E.g.: XML: <employee>
             <firstname>John</firstname>
             <lastname>Smith</lastname>
           </employee>
XS: <xs:element name="employee">
      <xs:complexType>
        <xs:sequence>
          <xs:element name="firstname" type="xs:string"/>
          <xs:element name="lastname" type="xs:string"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>
- Complex Mixed-Content Element
Definition: Element that can contain Elements and Text (and optionally Attributes)
E.g.: XML: <letter>
             Dear Mr.<name>John Smith</name>.
             Your order <orderid>1032</orderid>
             will be shipped on <shipdate>2001-07-13</shipdate>.
           </letter>
XS: <xs:element name="letter">
      <xs:complexType mixed="true">
        <xs:sequence>
          <xs:element name="name" type="xs:string"/>
          <xs:element name="orderid" type="xs:positiveInteger"/>
          <xs:element name="shipdate" type="xs:date"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>
- Complex Text-Only Element
Definition: Element that can only contain Text (and optionally Attributes).
E.g.: XML: <shoesize country="france">35</shoesize>
XS: <xs:element name="shoesize">
      <xs:complexType>
        <xs:simpleContent>
          <xs:extension base="xs:integer">
            <xs:attribute name="country" type="xs:string" />
          </xs:extension>
        </xs:simpleContent>
      </xs:complexType>
    </xs:element>
Note: This specifies an Element of Complex Type and Simple Content. Because it can contain Attributes, the Element is of Complex Type. But because it does not contain other elements, it is of Simple Content.
Note: It follows, then, that if there are no Attributes, you might as well use a Simple Type, which would be equivalent.
Note: The XS syntax inside xs:simpleContent is like that of xs:simpleType in that you need to declare a derivation Element inside. But whereas xs:simpleType accepts xs:restriction, xs:list, or xs:union, xs:simpleContent accepts only xs:restriction or xs:extension, and it is xs:extension that lets Attributes be defined.
Note: If you want to both restrict the Type of the Text (xs:restriction) and also add an Attribute (xs:extension), you're going to have to separately define an intermediary (xs:restriction) Simple Type that you would then use as base for your xs:simpleContent's xs:extension.
- Complex Content
Definition: Note that there isn't a one-to-one relationship between "Complex Elements" and "Elements with Complex Content". Complex Content refers to what can be specified inside the opening and closing tags of the first 3 of the 4 Complex Elements defined above; Complex Text-Only Element is the exception.
Posted by juliob at
03:36 AM
October 05, 2003
XML Schema: String Types
The XML Schema specification comes with many different types of strings. What's the difference between them? Here's a quick reference to distinguish them.
Here, the string types are organized hierarchically in accordance with their inheritance structure.
- string  Any Unicode characters
  - normalizedString  A string without \r, \n, \t
    - token  A normalizedString without leading or trailing spaces and without 2 or more consecutive internal spaces
      - NMTOKEN  A token made of any mixture of letters, digits, or the characters - . _ :
      - Name  A token that's like an NMTOKEN whose initial character is a letter or one of the characters _ :
        - NCName  A Name without the character :
          - ID  An NCName; only used for attributes
          - IDREF  An NCName; only used for attributes
          - ENTITY  An NCName; only used for attributes
- QName  An NCName optionally prefixed by an NCName: (a primitive type, not derived from string)
- NOTATION  A QName that names a declared notation (also a primitive type)
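To see the NMTOKEN, Name, and NCName distinctions in practice, here is a small sketch that leans on the XmlConvert.VerifyNMTOKEN, VerifyName, and VerifyNCName methods from System.Xml, each of which throws an XmlException for a string that does not qualify. Those methods come from later framework versions than the one these notes were written against, and the wrapper class and its method names are invented for this illustration.

```csharp
using System;
using System.Xml;

public static class XmlNameCheckSketch
{
    // True if s is a valid NMTOKEN (letters, digits, - . _ : in any position).
    public static bool IsNmToken(string s)
    {
        try { XmlConvert.VerifyNMTOKEN(s); return true; }
        catch (XmlException) { return false; }
    }

    // True if s is a valid Name (an NMTOKEN that starts with a letter, _ or :).
    public static bool IsName(string s)
    {
        try { XmlConvert.VerifyName(s); return true; }
        catch (XmlException) { return false; }
    }

    // True if s is a valid NCName (a Name with no colon).
    public static bool IsNcName(string s)
    {
        try { XmlConvert.VerifyNCName(s); return true; }
        catch (XmlException) { return false; }
    }

    public static void Main()
    {
        string[] samples = { "2003-10-05", "xs:element", "element" };
        foreach (string s in samples)
        {
            Console.WriteLine(s + ": NMTOKEN=" + IsNmToken(s)
                + " Name=" + IsName(s) + " NCName=" + IsNcName(s));
        }
    }
}
```

A date-like string qualifies only as an NMTOKEN because it starts with a digit, while a prefixed name such as xs:element is a Name but not an NCName because of the colon.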
Posted by juliob at
11:22 PM
October 04, 2003
Visual C# .NET Gotchas
Design Gotchas
- Properties of New project: When creating a new project that will be referred to by the main project in your solution, you might wonder what properties can be customized so as to keep the projects more cohesive. The following might be obvious to you but not to a novice like me: the default namespace property can be set to be the same as the main project's; however, the assembly name has to be unique.
- Typed DataSet in a Separate Project: Separating your code into separate projects can have unexpected consequences. When you create a typed dataset by designing an XML Schema using the built-in XML Designer, you would expect your dataset to be listed as a referenced Typed DataSet when dragging the Dataset icon from the Toolbox onto a form. This does happen if the dataset and the form are in the same project. But if they're in separate projects, you won't be able to find it, even after you've properly added a reference to the dataset's project. What is not explained in the documentation is that you will need to build the dataset's project first, before it will be listed under "Referenced Datasets..." in the drop-down.
- Disappearing User Controls from Separate Project: If you define a user control in a separate project, you might suffer from disappearing control instances when building. Take the situation where you define a new user control in a project separate from your main form's project. Assume you already had a proper reference from the main form's project to this separate control project. If your main form contains an instance of your new user control and you rebuild, you might see the control instance disappear in front of your eyes. What you need to do to fix this problem is delete the reference and re-create it. This might even be considered a bug.
- Missing Form Resize Handles: If you have a user control that takes up all the space of your form, you won't be able to find the resize handles of either the form or the occupying control. So how do you resize them? No idea. Just avoid letting controls take up all the space in the form in the first place; leave a bit of margin at least on one side.
Visual Studio .NET 2002 Bugs
These are some of the annoying bugs that I encountered when using this 7.0 version of Microsoft Visual Studio .NET.
- Intelli-non-sense: As you're typing code into the text editor, Visual Studio (VS) sometimes seems to lose track of how to parse it. What happens is that all the syntax coloring disappears and Intellisense's completion features stop functioning. I'm not exactly sure what triggers this annoying behavior.
- Designer code corruption: Once in a while, VS will actually corrupt code. Stupidly, but fortunately, it seems to only corrupt its own designer-generated code. For example,
this.button2.Anchor = (System.Windows.Forms.AnchorStyles.Bottom |
System.Windows.Forms.AnchorStyles.Right);
will become
this.button2.Anchor = (System.Windows.Forms.AnchorStyles.Bottom |
stem.Windows.Forms.AnchorStyles.Right);
You find out when you compile and get a syntax error.
- TabControl in Designer: Designer has the nice feature that allows you to click on TabPages of a TabControl as it would behave at run-time. You can even click on the left and right arrows. But sometimes, the TabControl stops working and you can no longer scroll through or click on the tabs.
- Failed compilation due to locked DLL: Sometimes it takes two consecutive compilations to get the solution built because the compiler says that the DLL is locked. This can happen if the Object Browser is open. So just close it.
Posted by juliob at
10:01 PM
C# Language Notes
Exploring C# from a background of C++ and Java, I found that most C# language features are fairly intuitive. Here's a list of issues that I found to be exceptions.
Confusing Distinctions
- readonly vs. const: The difference between readonly and const is not explained clearly in the VS.NET documentation. It makes more sense once you realize that const is inlined at compile-time while readonly is evaluated at run-time. See the excellent Comparative Overview of C#.
- event vs. delegate: Another distinction is not made clear by the documentation, this time between an event and a delegate-typed object. The article C# events vs. delegates does a great job of clarifying the differences.
- Destructors vs. Finalize(): I'm still not sure of the exact difference between a destructor and the Finalize() method. One functional distinction is that the destructor automatically invokes the base's destructor. In light of that, it makes sense that a Finalize() method can be found on the Object class since the Object class has no base class. But are we supposed to write destructors or a Finalize() in our classes?
It seems that the recommendation for user classes is to write destructors. But I notice that the Finalize() method is mentioned in classes other than the Object class. So which is it?
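The readonly/const and destructor points can be made concrete with a short sketch. The Settings class and its members are hypothetical. On the destructor question, the C# compiler turns a ~ClassName() destructor into an override of Finalize() that also calls the base class's Finalize(), and it rejects a hand-written override of Finalize(), so in C# the destructor syntax is the one to write.

```csharp
using System;

public class Settings
{
    // const: the value must be known at compile time, and the compiler
    // copies the literal into any code that uses Settings.MaxItems.
    public const int MaxItems = 10;

    // readonly: assigned once at run-time, either in the declaration
    // or in a constructor, and fixed afterwards.
    public static readonly int ProcessorCount = Environment.ProcessorCount;
    public readonly DateTime CreatedAt;

    public Settings()
    {
        CreatedAt = DateTime.Now;
    }

    // Destructor syntax; compiled into an override of Finalize().
    // Left empty here purely to show the form.
    ~Settings()
    {
    }
}

public static class ReadonlyConstSketch
{
    public static void Main()
    {
        Settings s = new Settings();
        Console.WriteLine(Settings.MaxItems);
        Console.WriteLine(Settings.ProcessorCount);
        Console.WriteLine(s.CreatedAt <= DateTime.Now);
        // s.CreatedAt = DateTime.Now;  // would not compile: readonly field
    }
}
```

One practical consequence of the inlining: if a library changes a public const, code compiled against the old value keeps that old value until it is rebuilt, whereas a static readonly field is read at run-time.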
C# vs. Java Quick Reference
Here are some comparisons between the two similar languages that are not mentioned by the article
Comparative Overview of C#.
- Types: In Java, to get a Class object, you use either Class.forName("ClassNameString") or object.getClass(). You can then call getName() or toString() on the Class object to get the name in string form. In C#, to get a Type object, you use either typeof(ClassName) or object.GetType(), which, unlike in Java, work on primitive types as well. You can then reference properties Name or FullName or call ToString() on the Type object to get the name in string form.
- As: From the documentation:
The as operator is like a cast except that it yields null on conversion failure instead of raising an exception. More formally, an expression of the form:
expression as type
is equivalent to:
expression is type ? (type)expression : (type)null
except that expression is evaluated only once.
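Both points in this list can be tried out with a few lines. This is only a sketch; the class and method names are invented for the illustration.

```csharp
using System;

public static class TypeAndAsSketch
{
    // GetType() works on any value, including a boxed primitive.
    public static string FullNameOf(object value)
    {
        return value.GetType().FullName;
    }

    // 'as' yields null instead of throwing when the conversion fails.
    public static int LengthOrMinusOne(object value)
    {
        string text = value as string;
        return text == null ? -1 : text.Length;
    }

    public static void Main()
    {
        Console.WriteLine(typeof(int).FullName);        // System.Int32
        Console.WriteLine(FullNameOf(42));              // System.Int32
        Console.WriteLine(typeof(string).Name);         // String
        Console.WriteLine(LengthOrMinusOne("hello"));   // 5
        Console.WriteLine(LengthOrMinusOne(42));        // -1
    }
}
```

A plain cast, (string)value, would throw an InvalidCastException for the last call, which is the difference the quoted documentation describes.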
Posted by juliob at
09:36 PM
October 01, 2003
Perl 5 Arrays of Arrays
This is a demonstration of perl5's arrays of arrays. It was written to help a friend.
You have to be comfortable with simple data structures first. One key syntax difference between perl and other languages when it comes to arrays is this: an array is @a but an element of an array is $a[5], not @a[5] as you would expect.
This is counter-intuitive as you have to use 2 pieces of syntax to get at your element: [] and $. $ signifies what kind of element you expect out of a[5]: a scalar.
Sample code
#!/usr/bin/perl
my @input = ( [1, 2, 3], [4, 5, 6] );
# [ 1, 2, 3 ] is an anonymous array.
# I could have this instead:
#@line1 = ( 1, 2, 3 );
#@line2 = ( 4, 5, 6 );
#my @input = ( \@line1, \@line2 );
my @data;
my $i = 0;
# The 'my' in the loop is imperative, otherwise, each row will erase the
# previous one.
# The braces in @{ ... } are imperative because of operator precedence
# against the [ ] brackets
while (my @dataRow = @{$input[$i++]}) {
    print @dataRow, "\n";
    # This stores a pointer to the @dataRow array, i.e. its address or
    # reference.
    # So essentially, you get an array of addresses of arrays, which
    # effectively gives you an array of arrays. Note that each sub-array
    # (each row) can be a different size, unlike traditional 2-dimensional
    # arrays.
    push @data, \@dataRow;
}
# Now how do we get the data out?
# In our array of arrays, if you say @data[5] you're asking for
# data[5] as an array. But that's not the way we stored things;
# we stored things as addresses of (or references to) arrays.
# So you have to say $data[5] to get the address of the array, and use @ to
# get the array, i.e. @{$data[5]}
# This is called dereferencing.
print "Output loop\n";
foreach (@data) {
    print @{$_}, "\n";
    # I could have this instead, but less clear:
    #print @$_, "\n";
    print ${$_}[0], "\n";
    # I could have this instead, but less clear:
    #print $$_[0], "\n";
}
# Or equivalently
print "Explicit Output\n";
print @{$data[0]}, "\n";
print ${$data[0]}[0], "\n";
print @{$data[1]}, "\n";
print ${$data[1]}[0], "\n";
Posted by juliob at
06:33 PM